Release v1.6.1 of proxy-init adds support for iptables-nft. This change
bumps up the proxy-init version used in code, chart values, and golden
files.
* Update go.mod dep
* Update CNI plugin with new opts
* Update proxy-init ref in golden files and chart values
* Update policy controller CI workflow
Signed-off-by: Matei David <matei@buoyant.io>
Watch events for objects in the kube-system namespace were previously ignored.
In certain situations, this would cause the destination service to return
invalid (outdated) endpoints for services in kube-system - including unmeshed
services.
It [was suggested][1] that kube-system events were ignored to avoid handling
frequent Endpoint updates - specifically from [controllers using Endpoints for
leader elections][2]. As of Kubernetes 1.20, these controllers [default to using
Leases instead of Endpoints for their leader elections][3], obviating the need
to exclude (or filter) updates from kube-system. The exclusions have been
removed accordingly.
[1]: https://github.com/linkerd/linkerd2/pull/4133#issuecomment-594983588
[2]: https://github.com/kubernetes/kubernetes/issues/86286
[3]: https://github.com/kubernetes/kubernetes/pull/94603
Signed-off-by: Jacob Henner <code@ventricle.us>
Fixes #8592
Increase the minimum supported Kubernetes version from 1.20 to 1.21. This allows us to drop support for batch/v1beta1/CronJob and discovery/v1beta1/EndpointSlices, instead using only v1 of those resources. This fixes the deprecation warnings about these resources printed by the CLI.
Signed-off-by: Alex Leong <alex@buoyant.io>
The injector configures the proxy with a set of known inbound ports
which are used (by the proxy) to discover inbound server configuration.
The list of ports is derived from the pod's container ports; container
ports are optional and thus may not be present. The proxy supports
dynamic discovery of additional ports at runtime, but since that
discovery is lazy, additional ports may be dropped or updated long after
pod start-up.
To ensure HTTP probes are handled correctly, this change introduces new
functionality to configure the list of inbound ports for the proxy with
any ports targeted by healthcheck probes, as long as they are HTTP, and
even if they are not present in the containerPorts configuration.
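As a rough sketch (not the injector's actual implementation; the package and function names are assumptions), gathering the numeric ports targeted by HTTP probes from a pod spec looks roughly like this:
```go
package inject

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// probePorts is a sketch: collect the numeric ports targeted by HTTP
// liveness/readiness probes so they can be appended to the proxy's known
// inbound ports, even when absent from containerPorts.
func probePorts(spec *corev1.PodSpec) []int32 {
	var ports []int32
	for _, c := range spec.Containers {
		for _, probe := range []*corev1.Probe{c.LivenessProbe, c.ReadinessProbe} {
			if probe == nil || probe.HTTPGet == nil {
				continue // only HTTP probes are considered
			}
			// Named ports would need to be resolved against containerPorts;
			// this sketch only handles numeric ports.
			if probe.HTTPGet.Port.Type == intstr.Int {
				ports = append(ports, probe.HTTPGet.Port.IntVal)
			}
		}
	}
	return ports
}
```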
This change also introduces additional liveness (or readiness) probes to
the current injector webhook test fixtures in order to assert that
injected pods will always have their healthcheck target ports included
in the proxy's configuration.
Closes #8638
Signed-off-by: Matei David <matei@buoyant.io>
* Fix injector not emitting skip events properly
A resource that cannot be injected -- for a variety of reasons, such as
using the host network, or not mounting a SA token when token projection
is disabled -- may still have an annotation patch if the resource's
endpoints have ports marked as opaque.
When an annotation patch is generated but an injection patch is not, the
injector does not emit an event and considers the resource "injected",
when in fact it does not have the sidecar. This can make it hard to
investigate why a resource that is supposed to be injected is not,
because the failure is not obvious.
This change refactors and simplifies the logic by emitting an event
whenever a resource is not injected, regardless of whether an annotation
patch is generated for it. This should provide better visibility in case
of failures. Furthermore, the change refactors the code to avoid too
many early returns and make it easier to trace the codepath through the
function. The initial assumption that an annotation patch should not
increment the injection-skipped counter in the admission response has
been left intact; in other words, an annotation patch will not count in
the metrics as a skip, but an event with the skip reason is still emitted.
Fixes #8634
Signed-off-by: Matei David <matei@buoyant.io>
Fixes #8624
When the proxy-injector encounters a resource with an owner ref, it calls `api.GetObjects` to fetch the owner. If the owner is a kind which is not supported by the proxy-injector, we will panic.
We add a condition so that we only attempt to fetch the owner resource if it is a kind we support.
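A minimal sketch of such a guard, with a hypothetical allow-list and helper name (not the injector's actual code):
```go
package inject

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// supportedOwnerKinds is a hypothetical allow-list of owner kinds the
// proxy-injector knows how to fetch; any other kind is skipped rather than
// passed to the owner lookup, avoiding the panic.
var supportedOwnerKinds = map[string]struct{}{
	"Deployment": {}, "ReplicaSet": {}, "StatefulSet": {},
	"DaemonSet": {}, "Job": {}, "CronJob": {}, "ReplicationController": {},
}

// ownerIsSupported reports whether the owner reference points at a kind we
// know how to fetch.
func ownerIsSupported(ref *metav1.OwnerReference) bool {
	if ref == nil {
		return false
	}
	_, ok := supportedOwnerKinds[ref.Kind]
	return ok
}
```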
Signed-off-by: Alex Leong <alex@buoyant.io>
Our docker images hardcode a patch version, 1.17.3, which does not
include a variety of important fixes that have been released:
> go1.17.4 (released 2021-12-02) includes fixes to the compiler, linker,
> runtime, and the go/types, net/http, and time packages. See the Go
> 1.17.4 milestone on our issue tracker for details.
> go1.17.5 (released 2021-12-09) includes security fixes to the net/http
> and syscall packages. See the Go 1.17.5 milestone on our issue tracker
> for details.
> go1.17.6 (released 2022-01-06) includes fixes to the compiler, linker,
> runtime, and the crypto/x509, net/http, and reflect packages. See the Go
> 1.17.6 milestone on our issue tracker for details.
> go1.17.7 (released 2022-02-10) includes security fixes to the go
> command, and the crypto/elliptic and math/big packages, as well as bug
> fixes to the compiler, linker, runtime, the go command, and the
> debug/macho, debug/pe, and net/http/httptest packages. See the Go 1.17.7
> milestone on our issue tracker for details.
> go1.17.8 (released 2022-03-03) includes a security fix to the
> regexp/syntax package, as well as bug fixes to the compiler, runtime,
> the go command, and the crypto/x509 and net packages. See the Go 1.17.8
> milestone on our issue tracker for details.
> go1.17.9 (released 2022-04-12) includes security fixes to the
> crypto/elliptic and encoding/pem packages, as well as bug fixes to the
> linker and runtime. See the Go 1.17.9 milestone on our issue tracker for
> details.
> go1.17.10 (released 2022-05-10) includes security fixes to the syscall
> package, as well as bug fixes to the compiler, runtime, and the
> crypto/x509 and net/http/httptest packages. See the Go 1.17.10 milestone
> on our issue tracker for details.
> go1.17.11 (released 2022-06-01) includes security fixes to the
> crypto/rand, crypto/tls, os/exec, and path/filepath packages, as well as
> bug fixes to the crypto/tls package. See the Go 1.17.11 milestone on our
> issue tracker for details.
This changes our container configs to use the latest 1.17 release on
each build so that these patch releases are picked up without manual
intervention.
Signed-off-by: Oliver Gould <ver@buoyant.io>
Fixes #8134
The multicluster probe-gateway service (which the service-mirror controller uses to send health probes to the remote gateway) has the `mirror.linkerd.io/cluster-name` label but not the `mirror.linkerd.io/remote-svc-fq-name` annotation. This makes sense because the probe-gateway service does not correspond to any target service on the remote cluster and instead targets the gateway itself.
We update the logic in the destination controller so that we can still add the `dst_target_cluster` metric label when only the `mirror.linkerd.io/cluster-name` label is present but not the `mirror.linkerd.io/remote-svc-fq-name` annotation. This allows us to have the `dst_target_cluster` metric label for service-mirror controller probe traffic.
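Conceptually, the updated labeling logic is along these lines (a simplified sketch, not the controller's exact code):
```go
package destination

// addTargetClusterLabel is a simplified sketch: derive the dst_target_cluster
// metric label from the mirror.linkerd.io/cluster-name label alone, so the
// probe-gateway service (which has no remote-svc-fq-name annotation) is still
// labeled with its target cluster.
func addTargetClusterLabel(metricLabels, svcLabels map[string]string) {
	if cluster, ok := svcLabels["mirror.linkerd.io/cluster-name"]; ok && cluster != "" {
		metricLabels["dst_target_cluster"] = cluster
	}
}
```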
Signed-off-by: Alex Leong <alex@buoyant.io>
We frequently compare data structures--sometimes very large data
structures--that are difficult to compare visually. This change replaces
uses of `reflect.DeepEqual` with `deep.Equal`. `go-test`'s `deep.Equal`
returns a diff of values that are not equal.
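For example, with the `github.com/go-test/deep` module a failed comparison reports which fields differ instead of a bare boolean:
```go
package main

import (
	"fmt"

	"github.com/go-test/deep"
)

type endpoint struct {
	IP   string
	Port int
}

func main() {
	want := map[string]endpoint{"web": {IP: "10.0.0.1", Port: 8080}}
	got := map[string]endpoint{"web": {IP: "10.0.0.2", Port: 8080}}

	// deep.Equal returns a list of human-readable differences (nil when equal),
	// which is far easier to act on than reflect.DeepEqual's bare bool.
	if diff := deep.Equal(want, got); diff != nil {
		fmt.Println(diff) // e.g. [map[web].IP: 10.0.0.1 != 10.0.0.2]
	}
}
```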
Signed-off-by: Oliver Gould <ver@buoyant.io>
Opaque ports may be configured through annotations, where the value may
be either a range (e.g. 0-255) or a comma-delimited string
("121,122"). When configured as a comma-delimited string, our parsing
logic trims any leading and trailing commas and splits the value; each
port is then converted from a string to an int.
If whitespace is used, so that the value looks like "121, 122", the
conversion from string to integer fails with an error such as "\" 122\"
is not a valid lower-bound". This can confuse users, whose services and
endpoints are left functioning with undefined behaviour.
This change introduces logic to strip any leading or trailing whitespace
from each token when parsing the annotation value; this way, we are
guaranteed not to hit parsing errors when converting the strings. To
validate the behaviour, a new unit test has been added to the opaque
ports watcher, using a multi-port opaque-ports annotation on a service
where the value contains a space.
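A simplified sketch of the tokenizing fix (the function name is hypothetical, not the actual parser):
```go
package util

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePorts is a simplified sketch of the fix: trim whitespace around each
// token of a comma-delimited annotation value (e.g. "121, 122") so that
// " 122" no longer fails the string-to-int conversion.
func parsePorts(v string) ([]int, error) {
	var ports []int
	for _, tok := range strings.Split(strings.Trim(v, ","), ",") {
		tok = strings.TrimSpace(tok)
		if tok == "" {
			continue
		}
		p, err := strconv.Atoi(tok)
		if err != nil {
			return nil, fmt.Errorf("%q is not a valid port: %w", tok, err)
		}
		ports = append(ports, p)
	}
	return ports, nil
}
```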
Signed-off-by: Matei David <matei@buoyant.io>
While looking into #8273, I wanted to confirm that the destination
controller uses the default opaque ports configuration for arbitrary
(unmeshed) IPs.
This change adds a test that exercises resolution behavior for external
IPs.
Signed-off-by: Oliver Gould <ver@buoyant.io>
This change updates the repo to use `linkerd2-proxy-api` v0.4.0 and
updates `bin/protoc` to use v3.20 to match the configuration in the
other repo.
The policy-controller builds are updated to use our `bin/protoc` wrapper
so that all builds go through the same toolchain (and to avoid compiling
protoc on each build).
Signed-off-by: Oliver Gould <ver@buoyant.io>
c1a1430d added new policy CRDs: `AuthorizationPolicy`,
`MeshTLSAuthentication` and `NetworkAuthentication`, with a controller
implemented in Rust.
This change adds Go types for these resources so that they may be
accessed from the CLI, etc.
Co-authored-by: Zahari Dichev <zaharidichev@gmail.com>
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
Signed-off-by: Oliver Gould <ver@buoyant.io>
The destination controller can improperly handle updates by returning a
map reference instead of a new data structure. This breaks diffing logic,
as newly added endpoints appear to pre-exist.
This change ensures that a fresh data structure is used when handling
discovery updates.
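In essence, the handler now hands out a copy rather than the shared map; a hedged sketch with placeholder types:
```go
package watcher

// address stands in for the watcher's endpoint address type.
type address struct{ ip string }

// snapshotAddresses returns a fresh copy of the address map so that callers
// diffing old state against new state never observe mutations through a
// shared map reference.
func snapshotAddresses(addrs map[string]address) map[string]address {
	out := make(map[string]address, len(addrs))
	for k, v := range addrs {
		out[k] = v
	}
	return out
}
```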
Fixes #8143
Signed-off-by: Alex Leong <alex@buoyant.io>
Follow-up to #8087 that allows pprof to be enabled via the `--set
enablePprof=true` flag.
Each control plane component spawns its own admin server, so each one receives
its own `enable-pprof` flag. When `enablePprof=true`, it is passed through to
each component so that when it launches its admin server, its pprof endpoints
are enabled.
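A minimal sketch of how an admin server can gate its pprof endpoints behind such a flag (not the exact linkerd code; handler paths follow the standard `net/http/pprof` layout):
```go
package admin

import (
	"net/http"
	"net/http/pprof"
)

// newAdminMux registers the pprof endpoints only when enablePprof is true,
// mirroring how the -enable-pprof flag is intended to behave.
func newAdminMux(enablePprof bool) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/ready", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	if enablePprof {
		mux.HandleFunc("/debug/pprof/", pprof.Index)
		mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
		mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
		mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
		mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	}
	return mux
}
```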
A note on the templating: `-enable-pprof={{.Values.enablePprof | default false}}`.
`false` values are not rendered by Helm, so without the `... | default false}}` it
tries to pass the flag as `-enable-pprof=""`, which results in an error. Inlining
this felt better than conditionally passing the flag with:
```yaml
{{ if .Values.enablePprof -}}
-enable-pprof={{.Values.enablePprof}}
{{ end -}}
```
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Closes #7881
This makes the rest of the necessary fixes to satisfy the `staticcheck` lint.
The only class of lints being skipped are those related to deprecated tap code. There was some discussion, started by @adleong on the original change, about whether this is _actually_ deprecated [here](https://github.com/linkerd/linkerd2/pull/3240#discussion_r313634584); it doesn't look like we ever came back around to fully removing it, but I don't think that should be a blocker for enabling the lint right now.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Closes #7980
A pod is considered `Burstable` instead of `Guaranteed` if there exists at least one container in the pod that specifies CPU/memory limits/requests that do not match.
The `linkerd-init` container falls into this category meaning that even if all other containers in a Pod have matching CPU/memory limits/requests, the Pod will not be considered `Guaranteed` because of `linkerd-init`'s hardcoded values.
This changes the values to match, meaning that `linkerd-init` will not be the culprit container if a Pod is not considered `Guaranteed`. Raising the requests—instead of lowering the limits—felt like the safer option here. This means that the container will now always be guaranteed these amounts _and_ will never use more.
[Docs](https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/#create-a-pod-that-gets-assigned-a-qos-class-of-guaranteed) explain this in more detail.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Closes #7826
This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed by individual changes, but this enables them by default so that all future changes are caught ahead of time.
A significant number of these lints have been excluded by the various `exclude-rules` added to `.golangci.yml`. These include operations on files that generally do not fail, such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors when cleaning up functions via the `defer` keyword.
Aside from those, there are several other rules added that all have comments explaining why it's okay to ignore the errors that they cover.
Finally, several smaller fixes in the code have been made where it seems necessary to catch errors or at least log them.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Fixes #7904
Allow the `Server` CRD to have the `PodSelector` entry be an empty object, by removing the `omitempty` tag from its Go type definition and the `oneof` section in the CRD. No update to the CRD version is required, as this is a backwards-compatible change; overriding the CRD was tested and works fine.
Also added some unit tests to confirm podSelector conditions are ANDed, and some minor refactorings in the `Selector` constructors.
Co-authored-by: Oliver Gould <ver@buoyant.io>
[gocritic][gc] helps to enforce some consistency and check for potential
errors. This change applies linting changes and enables gocritic via
golangci-lint.
[gc]: https://github.com/go-critic/go-critic
Signed-off-by: Oliver Gould <ver@buoyant.io>
Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks
that error formatting and inspection is wrapping-aware.
This change enables `errorlint` in golangci-lint and updates all error
handling code to pass the lint. Some comparisons in tests have been left
unchanged (using `//nolint:errorlint` comments).
[el]: https://github.com/polyfloyd/go-errorlint
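For example, wrapping-aware handling uses `%w`, `errors.Is`, and `errors.As` instead of direct comparisons:
```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

func main() {
	_, err := os.Open("missing.conf")
	wrapped := fmt.Errorf("loading config: %w", err) // %w wraps the cause

	// Not wrapping-aware: fails once the error is wrapped.
	fmt.Println(wrapped == fs.ErrNotExist) // false

	// Wrapping-aware: walks the error chain, as errorlint expects.
	fmt.Println(errors.Is(wrapped, fs.ErrNotExist)) // true

	var pathErr *fs.PathError
	if errors.As(wrapped, &pathErr) {
		fmt.Println("failed path:", pathErr.Path)
	}
}
```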
Signed-off-by: Oliver Gould <ver@buoyant.io>
When the webhook server decodes a request's JSON payload, it may try to
use a nil value when handling the error. Furthermore, if the JSON
payload has a `nil` `Request` value, it may attempt to dereference the
value.
This change improves the webhook server's error handling to return a
`400 Bad Request` status if either of these cases are encountered.
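A minimal sketch of that hardening (handler and variable names here are illustrative, not the actual webhook code):
```go
package webhook

import (
	"encoding/json"
	"net/http"

	admissionv1 "k8s.io/api/admission/v1"
)

// serveReview decodes an AdmissionReview and rejects malformed payloads with
// 400 Bad Request instead of dereferencing a nil review or request.
func serveReview(w http.ResponseWriter, r *http.Request) {
	var review admissionv1.AdmissionReview
	if err := json.NewDecoder(r.Body).Decode(&review); err != nil {
		http.Error(w, "failed to decode admission review: "+err.Error(), http.StatusBadRequest)
		return
	}
	if review.Request == nil {
		http.Error(w, "admission review has no request", http.StatusBadRequest)
		return
	}
	// ... handle review.Request ...
}
```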
Signed-off-by: Oliver Gould <ver@buoyant.io>
In several places where we build TLS servers (usually in our webhooks),
we use the default TLS configuration, which enables legacy versions of
TLS.
This change updates these servers to specify a minimum TLS version of
v1.2.
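Concretely, the relevant setting is `tls.Config.MinVersion`; a minimal sketch:
```go
package main

import (
	"crypto/tls"
	"log"
	"net/http"
)

func main() {
	srv := &http.Server{
		Addr: ":8443",
		TLSConfig: &tls.Config{
			MinVersion: tls.VersionTLS12, // refuse TLS 1.0/1.1 handshakes
		},
	}
	// Certificate paths are placeholders for whatever the server mounts.
	log.Fatal(srv.ListenAndServeTLS("tls.crt", "tls.key"))
}
```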
Signed-off-by: Oliver Gould <ver@buoyant.io>
We use `fmt.Sprintf` to format URIs in several places we could be using
`net.JoinHostPort`. `net.JoinHostPort` ensures that IPv6 addresses are
properly escaped in URIs.
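For example:
```go
package main

import (
	"fmt"
	"net"
)

func main() {
	host, port := "2001:db8::1", "8443"

	// fmt.Sprintf leaves the IPv6 address unbracketed, producing an invalid URI host.
	fmt.Println("https://" + fmt.Sprintf("%s:%s", host, port)) // https://2001:db8::1:8443

	// net.JoinHostPort brackets IPv6 literals correctly.
	fmt.Println("https://" + net.JoinHostPort(host, port)) // https://[2001:db8::1]:8443
}
```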
Signed-off-by: Oliver Gould <ver@buoyant.io>
If the proxy doesn't become ready, `linkerd-await` never succeeds
and the proxy's logs never become accessible.
This change adds a default 2-minute timeout so that pod startup
continues despite the proxy failing to become ready. `linkerd-await`
fails and `kubectl` reports that a post-start hook failed.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
* Remove the `proxy.disableIdentity` config
Fixes #7724
Also:
- Removed the `linkerd.io/identity-mode` annotation.
- Removed the `config.linkerd.io/disable-identity` annotation.
- Removed the `linkerd.proxy.validation` template partial, which only
made sense when `proxy.disableIdentity` was `true`.
- TestInjectManualParams now requires hitting the cluster to retrieve the
  trust root.
Fixes #6337
GetProfile can be called with an FQDN for a specific member of a service, e.g.
```
web-0.foo.ns.svc.cluster.local
```
If that endpoint is not backed by a pod, `GetProfile` will not return an endpoint in the response.
We update the logic to return an endpoint in the response even when the endpoint is not backed by a pod.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes #6584 #6620 #7405
# Namespace Removal
With this change, the `namespace.yaml` template is rendered only for CLI installs and not for Helm, and likewise for the `namespace:` entry in namespace-level objects (using a new `partials.namespace` helper).
The `installNamespace` and `namespace` entries in `values.yaml` have been removed.
Then, in the templates where the namespace is required, we moved from `.Values.namespace` to `.Release.Namespace`, which is filled in automatically by Helm. For the CLI, `install.go` now explicitly defines the contents of the `Release` map alongside `Values`.
The proxy-injector has a new `linkerd-namespace` argument given the namespace is no longer persisted in the `linkerd-config` ConfigMap, so it has to be passed in. To pass it further down to `injector.Inject()` without modifying the `Handler` signature, a closure was used.
------------
Update: Merged-in #6638: Similar changes for the `linkerd-viz` chart:
Stop rendering `namespace.yaml` in the `linkerd-viz` chart.
The additional change here is the addition of the `namespace-metadata.yaml` template (and its RBAC), _not_ rendered in CLI installs, which is a Helm `post-install` hook consisting of a Job that executes a script adding the required annotations and labels to the viz namespace using a PATCH request against kube-api. The script first checks whether the namespace already has annotations/labels entries; if it doesn't, extra ops are added to that patch.
---------
Update: Merged-in the approved #6643, #6665 and #6669 which address the `linkerd2-cni`, `linkerd-multicluster` and `linkerd-jaeger` charts.
Additional changes from what's already mentioned above:
- Removes the install-namespace option from `linkerd install-cni`, which isn't found in `linkerd install` or `linkerd viz install` anyway, and would add some complexity to support.
- Added a dependency on the `partials` chart to the `linkerd-multicluster-link` chart, so that we can use the `partials.namespace` helper.
- We no longer have the restriction that the multicluster objects must live in a different namespace than linkerd. It's still good practice, and that's the default for the CLI install, but I removed that validation.
Finally, as a side-effect, the `linkerd mc allow` subcommand was fixed; it has been broken for a while apparently:
```console
$ linkerd mc allow --service-account-name foobar
Error: template: linkerd-multicluster/templates/remote-access-service-mirror-rbac.yaml:16:7: executing "linkerd-multicluster/templates/remote-access-service-mirror-rbac.yaml" at <include "partials.annotations.created-by" $>: error calling include: template: no template "partials.annotations.created-by" associated with template "gotpl"
```
---------
Update: see helm/helm#5465 describing the current best-practice
# Core Helm Charts Split
This removes the `linkerd2` chart, and replaces it with the `linkerd-crds` and `linkerd-control-plane` charts. Note that the viz and other extension charts are not concerned by this change.
Also note the original `values.yaml` file has been split into both charts accordingly.
### UX
```console
$ helm install linkerd-crds --namespace linkerd --create-namespace linkerd/linkerd-crds
...
# certs.yaml should contain identityTrustAnchorsPEM and the identity issuer values
$ helm install linkerd-control-plane --namespace linkerd -f certs.yaml linkerd/linkerd-control-plane
```
### Upgrade
As explained in #6635, this is a breaking change. Users will have to uninstall the `linkerd2` chart and install these two, and eventually roll out the proxies (they should continue to work during the transition anyway).
### CLI
The CLI install/upgrade code was updated to be able to pick the templates from these new charts, but the CLI UX remains identical to before.
### Other changes
- The `linkerd-crds` and `linkerd-control-plane` charts now carry a version scheme independent of linkerd's own versioning, as explained in #7405.
- These charts are Helm v3, which is reflected in the `Chart.yaml` entries and in the removal of the `requirements.yaml` files.
- In the integration tests, replaced the `helm-chart` arg with `helm-charts` containing the path `./charts`, used to build the paths for both charts.
### Followups
- Now it's possible to add a `ServiceProfile` instance for Destination in the `linkerd-control-plane` chart.
### What
`GetProfile` clients do not receive destination profiles that consider Server protocol fields the way that `Get` clients do. If a Server exists for a `GetProfile` destination specifying that the protocol for that destination is `opaque`, this information is not passed back to the client.
#7184 added this for `Get` by subscribing clients to Endpoint/EndpointSlice updates. When there is an update, or there is a Server update, the endpoints watcher passes this information back to the endpoint translator which handles sending the update back to the client.
For `GetProfile` the situation is different. As with `Get`, we only consider Servers when dealing with Pod IPs, but this only occurs in two situations for `GetProfile`.
1. The destination is a Pod IP and port
2. The destination is an Instance ID and port
In both of these cases, we need to check if a Server already selects the endpoint, and we need to subscribe for Server updates in case one that selects the endpoint is added or deleted.
### How
First we check if there is already a Server which selects the endpoint. This is so that when the first destination profile is returned, the client knows whether the destination is `opaque` or not.
After sending that first update, we then subscribe the client for any future updates which will come from a Server being added or deleted.
This is handled by the new `ServerWatcher` which watches for Server updates on the cluster; when an update occurs it sends that to the `endpointProfileTranslator`, which translates the protocol update into a DestinationProfile.
By introducing the `endpointProfileTranslator`, which only handles protocol updates, we're able to decouple the endpoint logic from `profileTranslator`—its `endpoint` field has been removed now that it only handles updates for ServiceProfiles for Services.
### Testing
A unit test has been added and below are some manual testing instructions to see how it interacts with Server updates:
<details>
<summary>app.yaml</summary>
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
  labels:
    app: pod
spec:
  containers:
    - name: app
      image: nginx
      ports:
        - name: http
          containerPort: 80
---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  name: srv
  labels:
    policy: srv
spec:
  podSelector:
    matchLabels:
      app: pod
  port: 80
  proxyProtocol: opaque
```
</details>
```shell
$ go run ./controller/cmd/main.go destination
```
```shell
$ linkerd inject app.yaml |kubectl apply -f -
...
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod 2/2 Running 0 53m 10.42.0.34 k3d-k3s-default-server-0 <none> <none>
$ go run ./controller/script/destination-client/main.go -method getProfile -path 10.42.0.34:80
...
```
You can add/delete `srv` as well as edit its `proxyProtocol` field to observe the correct DestinationProfile updates.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
The destination API expects a JSON-formatted context token which includes the node name of the requesting client, e.g.
```
go run controller/script/destination-client/main.go -method get -path web-svc.emojivoto.svc.cluster.local:80 --token '{"nodeName":"alex-worker"}'
```
Signed-off-by: Alex Leong <alex@buoyant.io>
Now that SMI functionality is fully being moved into the
[linkerd-smi](www.github.com/linkerd/linkerd-smi) extension, we can
stop supporting its functionality by default.
This means that the `destination` component will stop reacting
to `TrafficSplit` objects. When `linkerd-smi` is installed,
it converts `TrafficSplit` objects into `ServiceProfiles`
that the destination component can understand and react to accordingly.
Also, whenever a `ServiceProfile` with traffic splitting is associated
with a service, the same information (i.e. splits and weights) is also
surfaced through the `UI` (in the new `services` tab) and the `viz` cmd.
So we are not really losing any UI functionality here.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
EndpointSlices are enabled by default in our Kubernetes minimum version of 1.20. Thus we can change the default behavior of the destination controller to use EndpointSlices instead of Endpoints. This unblocks any functionality which is specific to EndpointSlices such as topology aware hints.
Signed-off-by: Alex Leong <alex@buoyant.io>
* build: upgrade to Go 1.17
This commit introduces three changes:
1. Update the `go` directive in `go.mod` to 1.17
2. Update all Dockerfiles from `golang:1.16.2` to
`golang:1.17.3`
3. Update all CI to use Go 1.17
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* chore: run `go fmt ./...`
This commit synchronizes `//go:build` lines with `// +build` lines.
Reference: https://go.googlesource.com/proposal/+/master/design/draft-gobuild.md
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
This change ensures that if a Server with `proxyProtocol: opaque` selects an endpoint backed by a pod, then destination requests for that pod reflect the fact that it handles opaque traffic.
Currently, the only way that opaque traffic is honored in the destination service is if the pod has the `config.linkerd.io/opaque-ports` annotation. With the introduction of Servers though, users can set `server.Spec.ProxyProtocol: opaque` to indicate that if a Server selects a pod, then traffic to that pod's `server.Spec.Port` should be opaque. Currently, the destination service does not take this into account.
There is an existing change up that _also_ adds this functionality; it takes a different approach by creating a policy server client for each endpoint that a destination has. For `Get` requests on a service, the number of clients scales with the number of endpoints that back that service.
This change fixes that issue by instead creating a Server watch in the endpoint watcher and sending updates through to the endpoint translator.
The two primary scenarios to consider are
### A `Get` request for some service is streaming when a Server is created/updated/deleted
When a Server is created or updated, the endpoint watcher iterates through its endpoint watches (`servicePublisher` -> `portPublisher`) and if it selects any of those endpoints, the port publisher sends an update if the Server has marked that port as opaque.
When a Server is deleted, the endpoint watcher once again iterates through its endpoint watches and deletes the address set's `OpaquePodPorts` field—ensuring that updates have been cleared of Server overrides.
### A `Get` request for some service happens after a Server is created
When a `Get` request occurs (or new endpoints are added—they both take the same path), we must check if any of those endpoints are selected by some existing Server. If so, we have to take that into account when creating the address set.
This part of the change gives me a little concern as we first must get all the Servers on the cluster and then create a set of _all_ the pod-backed endpoints that they select in order to determine if any of these _new_ endpoints are selected.
## Testing
Right now this can be tested by starting up the destination service locally and running `Get` requests on a service that has endpoints selected by a Server
**app.yaml**
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
  labels:
    app: pod
spec:
  containers:
    - name: app
      image: nginx
      ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: svc
spec:
  selector:
    app: pod
  ports:
    - name: http
      port: 80
---
apiVersion: policy.linkerd.io/v1alpha1
kind: Server
metadata:
  name: srv
  labels:
    policy: srv
spec:
  podSelector:
    matchLabels:
      app: pod
  port: 80
  proxyProtocol: HTTP/1
```
```bash
$ go run controller/script/destination-client/main.go -path svc.default.svc.cluster.local:80
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Currently, the MC `Link` CRD is being handled using the dynamic k8s client. It would be useful for consumers of this API if there was a typed API for this CRD.
The solution is to update the codegen script to generate this code.
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
In our chart values and (some) integration tests, we're using a deprecated
label for node selection. According to the warning messages we get during
installation, the label has been deprecated since k8s `v1.14`:
```
Warning: spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead
Warning: spec.jobTemplate.spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead
```
This PR replaces all occurrences of `beta.kubernetes.io/os` with
`kubernetes.io/os`.
Fixes #7225
When the `mirror.linkerd.io/remote-gateway-identity` and `mirror.linkerd.io/remote-svc-fq-name` annotations are set on an EndpointSlice object, the destination controller does not return the correct identity hints for that endpoint.
We fix an incorrect assignment to address this. We also fix some logic that could dereference a nil pointer instead of comparing empty strings. We add a test case to exercise these.
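The nil-safety part has roughly this shape (placeholder types; not the actual destination-controller code):
```go
package destination

// endpointHints stands in for the fields read off the EndpointSlice annotations.
type endpointHints struct {
	gatewayIdentity *string
}

// identityHint guards against a nil pointer and falls back to comparing the
// empty string, which is the shape of the nil-safety fix described above.
func identityHint(h endpointHints) string {
	if h.gatewayIdentity == nil || *h.gatewayIdentity == "" {
		return ""
	}
	return *h.gatewayIdentity
}
```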
Signed-off-by: Alex Leong <alex@buoyant.io>
## background
In order to upgrade `client-go` and other related libraries to `v0.22.0`, we had to address the deprecation of the service's `TopologyKeys` field. This field and its related feature have been deprecated and superseded by [Topology Aware Hints](https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/2433-topology-aware-hints/README.md).
The goal of topology aware hints is to provide a simpler way for users to prefer endpoints by basing decisions solely on the node's `topology.kubernetes.io/zone` label. If a node is in `zone-a`, then it should prefer endpoints that _should_ be consumed by clients in `zone-a`.
kube-proxy (and now the destination controller) know that an endpoint _should_ be consumed by clients in certain zones if its `Hints.ForZones` field is set with a zone value that matches that of the client.
For example, the endpoint slice controller may add the following hint to an endpoint:
```
- addresses: ["1.1.1.1"]
zone: "zone-a"
hints:
zone: "zone-b"
```
The above endpoint is an endpoint that is located in `zone-a` but should be consumed by clients in `zone-b`.
## changes
Now that topological preference is no longer a concept, we can remove it from the `servicePublisher` and `portPublisher` structs. The fields were only there so that the preference could be propagated down to individual addresses.
The `Hints` field is only present on endpoints that belong to an `EndpointSlice`, so use of this field is limited to the `endpointSliceToAddresses` function.
When endpoint slices are translated to an `AddressSet` now, for each address (endpoint) we make sure to copy the `Hints.ForZones` field if it is present. This field is only present if it's set by the endpoint slice controller and it has [several safeguards](https://kubernetes.io/docs/concepts/services-networking/topology-aware-hints/#safeguards).
After `endpointSliceToAddresses` has translated an endpoint slice into an `AddressSet` and updated the endpoint translator's `availableEndpoints`, filtering takes place and is the crux of this change.
For each potential address that we have to consider in `availableEndpoints`, we make sure to only return the set of addresses whose consumption zones (the zones in the `forZones` field) match the node's zone. That way, we only communicate with endpoints that the endpoint slice controller has designated for the current node.
This allows us to remove the ordering/hierarchy of topological regions and the special handling of the `*` value.
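Schematically, the zone filter looks something like this (hypothetical types, not the actual translator structs):
```go
package destination

// address is a stand-in for the translator's address type, carrying the zones
// copied from the EndpointSlice's Hints.ForZones field.
type address struct {
	ip       string
	forZones []string
}

// filterByZone keeps only the addresses whose hints include the node's zone.
// Passing through unhinted addresses is an assumption of this sketch, since
// the endpoint slice controller may leave hints unset.
func filterByZone(addrs []address, nodeZone string) []address {
	var out []address
	for _, a := range addrs {
		if len(a.forZones) == 0 {
			out = append(out, a)
			continue
		}
		for _, z := range a.forZones {
			if z == nodeZone {
				out = append(out, a)
				break
			}
		}
	}
	return out
}
```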
## testing
I've added a unit test which creates an endpoint translator tied to a node in `west-1a` and asserts that it only handles updates for addresses that should be consumed by clients in `west-1a`.
Closes #6637
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
The Linkerd proxy-init container is currently forced to run as root.
This removes the hardcoded `runAsNonRoot: false` and `runAsUser: 0`. This
way the container inherits the user ID from the proxy-init image instead,
which may allow it to run as non-root.
Fixes #5505
Signed-off-by: Schlotter, Christian <christian.schlotter@daimler.com>
Fixes #3260
## Summary
Currently, Linkerd uses a service account token to validate a pod
during the `Certify` request with identity, through which identity
is established on the proxy. This works well, as Kubernetes
attaches the `default` service account token of a namespace as a volume
(unless the user overrides it with a specific service account). The catch
here is that this token is aimed at letting the application talk to the
Kubernetes API and is not specifically for Linkerd. This means that there
are [controls outside of Linkerd](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#use-the-default-service-account-to-access-the-api-server) to manage this service token, which
users might want to use, [causing problems with Linkerd](https://github.com/linkerd/linkerd2/issues/3183)
since Linkerd might expect the token to be present.
To have more granular control over the token, and not rely on a
service token that can be managed externally, [Bound Service Account Tokens](https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/1205-bound-service-account-tokens)
can be used to generate tokens that are specifically for Linkerd,
bound to a specific pod, and carry an expiry.
## Background on Bound Service Account Tokens
This feature was GA'ed in Kubernetes 1.20, and is enabled by default
in most cloud provider distributions. Using this feature, Kubernetes can
be asked to issue tokens specifically for Linkerd's use (through audience-bound
configuration), with a specific expiry time (as the validation happens every
24 hours when establishing identity, we can follow the same), bound to
a specific pod (meaning verification fails if the pod object isn't available).
Because of all these bounds, and because this token can't be used for
anything else, it feels like the right thing to rely on when validating
a pod to issue a certificate.
### Pod Identity Name
We still use the same service account name as the pod identity
(used with metrics, etc.) as these tokens are all generated from the
same base service account attached to the pod (it could be `default`, or
the user-overridden one). This can be verified by looking at the `user`
field in the `TokenReview` response.
<details>
<summary>Sample TokenReview response</summary>
Here, the new token was created for the vault audience for a pod which
had a serviceAccount token volume projection and was using the `mine`
serviceAccount in the default namespace.
```json
"kind": "TokenReview",
"apiVersion": "authentication.k8s.io/v1",
"metadata": {
"creationTimestamp": null,
"managedFields": [
{
"manager": "curl",
"operation": "Update",
"apiVersion": "authentication.k8s.io/v1",
"time": "2021-10-19T19:21:40Z",
"fieldsType": "FieldsV1",
"fieldsV1": {"f:spec":{"f:audiences":{},"f:token":{}}}
}
]
},
"spec": {
"token": "....",
"audiences": [
"vault"
]
},
"status": {
"authenticated": true,
"user": {
"username": "system:serviceaccount:default:mine",
"uid": "889a81bd-e31c-4423-b542-98ddca89bfd9",
"groups": [
"system:serviceaccounts",
"system:serviceaccounts:default",
"system:authenticated"
],
"extra": {
"authentication.kubernetes.io/pod-name": [
"nginx"
],
"authentication.kubernetes.io/pod-uid": [
"ebf36f80-40ee-48ee-a75b-96dcc21466a6"
]
}
},
"audiences": [
"vault"
]
}
```
</details>
## Changes
- Update `proxy-injector` and install scripts to include the new
projected Volume and VolumeMount.
- Update the `identity` pod to validate the token with the linkerd
audience key.
- Added `identity.serviceAccountTokenProjection` to disable this
feature.
- Updated the erroring logic for `autoMountServiceAccount: false`
  to fail only when this feature is disabled.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>