* build(deps): bump sigs.k8s.io/gateway-api from 0.5.1 to 0.6.0
Bumps [sigs.k8s.io/gateway-api](https://github.com/kubernetes-sigs/gateway-api) from 0.5.1 to 0.6.0.
- [Release notes](https://github.com/kubernetes-sigs/gateway-api/releases)
- [Changelog](https://github.com/kubernetes-sigs/gateway-api/blob/main/CHANGELOG.md)
- [Commits](https://github.com/kubernetes-sigs/gateway-api/compare/v0.5.1...v0.6.0)
---
updated-dependencies:
- dependency-name: sigs.k8s.io/gateway-api
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* Account for possible errors returned from `AddEventHandler`
In v0.26.0 client-go's `AddEventHandler` method for informers started
returning a registration handle (that we ignore) and an error that we
now surface.
* client-go v0.26.0 removed the openstack plugin
* Temporary changes to trigger tests in k8s 1.21
- Adds an innocuous change to integration.yml so that all tests get
triggered
- Hard-code k8s version in `k3d cluster create` invocation to v1.21
* Revert "Temporary changes to trigger tests in k8s 1.21"
This reverts commit 3e1fdd0e5e.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
* Removed dupe imports
My IDE (vim-gopls) has been complaining for a while, so I decided to take
care of it. Found via
[staticcheck](https://github.com/dominikh/go-tools)
* Add stylecheck to go-lint checks
Fixes #10003
When endpoints are removed from an EndpointSlice resource, the destination controller builds a list of addresses to remove. However, if any of the removed endpoints have a Pod as their targetRef, we will attempt to fetch that pod to build the address to remove. If that pod has already been removed from the informer cache, this will fail and the endpoint will be skipped in the list of endpoints to be removed. This results in stale endpoints being stuck in the address set and never being removed.
We update the endpoint watcher to construct only a list of endpoint IDs for endpoints to remove, rather than fetching the entire pod object. Since we no longer attempt to fetch the pod, this operation is now infallible and endpoints will no longer be skipped during removal.
We also add a `TestEndpointSliceScaleDown` test to exercise this.
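A rough sketch of the idea, with hypothetical `ID` and `endpoint` types standing in for the watcher's real ones:
```go
package sketch

// ID is a hypothetical namespace/name pair identifying the endpoint to remove;
// the real watcher has its own ID type.
type ID struct {
	Namespace string
	Name      string
}

// endpoint carries just the fields this sketch needs from a removed endpoint.
type endpoint struct {
	TargetRefKind string
	TargetRefName string
	Namespace     string
}

// endpointIDsToRemove builds the removal set from the endpoints alone, never
// touching the pod informer cache, so a pod that has already been deleted
// cannot cause its endpoint to be skipped.
func endpointIDsToRemove(removed []endpoint) []ID {
	ids := make([]ID, 0, len(removed))
	for _, ep := range removed {
		if ep.TargetRefKind != "Pod" {
			continue
		}
		ids = append(ids, ID{Namespace: ep.Namespace, Name: ep.TargetRefName})
	}
	return ids
}
```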
Signed-off-by: Alex Leong <alex@buoyant.io>
When performing the HostPort mapping introduced in #9819, the `containsIP` function iterates through the pod IPs searching for a match against `targetIP` using `ip.String()`, but that returns something like `&PodIP{IP: xxx}`. Fixed that to just use `ip.IP`, and also completed the test fixtures to include both `PodIP` and `PodIPs` in the pod manifests.
Note this wasn't affecting the end result; it was just producing an extra warning, as shown below, which this change eliminates:
```bash
$ go test -v ./controller/api/destination/... -run TestGetProfiles
=== RUN TestGetProfiles
...
=== RUN TestGetProfiles/Return_profile_with_endpoint_when_using_pod_DNS
time="2022-11-29T09:38:48-05:00" level=info msg="waiting for caches to sync"
time="2022-11-29T09:38:49-05:00" level=info msg="caches synced"
time="2022-11-29T09:38:49-05:00" level=warning msg="unable to find container port as host (172.17.13.15) matches neither PodIP nor HostIP (&Pod{ObjectMeta:{pod-0 ns 0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[linkerd.io/control-plane-ns:linkerd] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,},Status:PodStatus{Phase:Running,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:172.17.13.15,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},})" test=TestGetProfiles/Return_profile_with_endpoint_when_using_pod_DNS
```
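For reference, a minimal sketch of the corrected comparison (the real helper lives in the destination server code; only the `ip.IP` vs `ip.String()` point matters here):
```go
package sketch

import corev1 "k8s.io/api/core/v1"

// containsIP reports whether targetIP is one of the pod's IPs.
func containsIP(pod corev1.Pod, targetIP string) bool {
	for _, ip := range pod.Status.PodIPs {
		// ip.String() renders the whole PodIP struct (e.g. "&PodIP{IP:10.42.0.1,}"),
		// which never equals a bare address; ip.IP is the plain string.
		if ip.IP == targetIP {
			return true
		}
	}
	return false
}
```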
Fixes #9896
The maps in `endpointTranslator` weren't being guarded against
concurrent access, so we're adding locks in the `Add` and `Remove`
methods. These methods also ultimately call the `SendMsg` method on
the gRPC `stream`, which is not
["thread-safe"](https://github.com/grpc/grpc-go/blob/master/stream.go#L122-L126),
so we're guarding against other problems as well.
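A minimal sketch of the locking pattern, with a stand-in `endpointTranslator` rather than the real struct:
```go
package sketch

import "sync"

// endpointTranslator here is a stand-in for the real type; only the locking
// pattern is the point. Guarding Add/Remove with a mutex also serializes the
// eventual stream.SendMsg calls, which are not safe for concurrent use.
type endpointTranslator struct {
	mu                 sync.Mutex
	availableEndpoints map[string]struct{}
}

func (et *endpointTranslator) Add(addrs ...string) {
	et.mu.Lock()
	defer et.mu.Unlock()
	for _, a := range addrs {
		et.availableEndpoints[a] = struct{}{}
	}
	// sendFilteredUpdate() would be called here, still under the lock.
}

func (et *endpointTranslator) Remove(addrs ...string) {
	et.mu.Lock()
	defer et.mu.Unlock()
	for _, a := range addrs {
		delete(et.availableEndpoints, a)
	}
}
```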
A new unit test `TestConcurrency` was added that failed in the following
ways before this fix:
When running the test with the `-race` flag, we immediately get the data race warning:
```bash
$ go test ./controller/api/destination/... -run TestConcurrency -race
time="2022-11-25T16:48:52-05:00" level=info msg="waiting for caches to sync"
time="2022-11-25T16:48:52-05:00" level=info msg="caches synced"
==================
WARNING: DATA RACE
Read at 0x00c0000c0040 by goroutine 161:
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:80 +0x29c
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x92
Previous write at 0x00c0000c0040 by goroutine 162:
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).sendFilteredUpdate()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:95 +0x66
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:83 +0x330
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x92
Goroutine 161 (running) created at:
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x6f
testing.tRunner()
/usr/local/go/src/testing/testing.go:1439 +0x213
testing.(*T).Run.func1()
/usr/local/go/src/testing/testing.go:1486 +0x47
Goroutine 162 (running) created at:
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x6f
testing.tRunner()
/usr/local/go/src/testing/testing.go:1439 +0x213
testing.(*T).Run.func1()
/usr/local/go/src/testing/testing.go:1486 +0x47
```
If run without the `-race` flag, we get the `concurrent map writes` panic reported in #9896:
```bash
$ go test ./controller/api/destination/... -run TestConcurrency -count=1
time="2022-11-25T16:53:25-05:00" level=info msg="waiting for caches to sync"
time="2022-11-25T16:53:25-05:00" level=info msg="caches synced"
fatal error: concurrent map writes
goroutine 187 [running]:
runtime.throw({0x1b57bc4?, 0x500000000000000?})
/usr/local/go/src/runtime/panic.go:992 +0x71 fp=0xc00013dc80 sp=0xc00013dc50 pc=0x43a5b1
runtime.mapassign(0xc00013dec8?, 0x2?, 0x0?)
/usr/local/go/src/runtime/map.go:595 +0x4d6 fp=0xc00013dd00 sp=0xc00013dc80 pc=0x4113b6
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add(...)
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:80
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x1a8 fp=0xc00013dfe0 sp=0xc00013dd00 pc=0x16d1da8
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00013dfe8 sp=0xc00013dfe0 pc=0x46d721
created by github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency
/home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x3c
```
Maps the request port to the container's port if the request comes in from the node network and has a hostPort mapping.
Problem:
When a request for a container comes in from the node network, the node port is used, ignoring the hostPort mapping.
Solution:
When a request is seen coming from the node network, get the container port from the pod spec.
Validation:
Fixed an existing unit test and wrote a new one driving GetProfile specifically.
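A sketch of where the mapping comes from, using the core `Pod` spec; the actual `GetProfile` code path is more involved:
```go
package sketch

import corev1 "k8s.io/api/core/v1"

// hostPortToContainerPort looks up the container port that a hostPort maps to
// in the pod spec, returning false when no container declares that hostPort.
func hostPortToContainerPort(pod corev1.Pod, hostPort int32) (int32, bool) {
	for _, c := range pod.Spec.Containers {
		for _, p := range c.Ports {
			if p.HostPort == hostPort {
				return p.ContainerPort, true
			}
		}
	}
	return 0, false
}
```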
Fixes #9677
Signed-off-by: Steve Jenson <stevej@buoyant.io>
Closes #9141
This introduces the `--destination-pod` flag to the `linkerd diagnostics
endpoints` command which allows users to target a specific destination Pod when
there are multiple running in a cluster.
This can be useful for issues like #8956, where Linkerd HA is installed and
there seem to be stale endpoints in the destination service. Being able to run
this command helps identify which destination Pods (if not all) have an
incorrect view of the cluster.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
The proxy-init repo is changing its structure and, as such, we want to
minimize cross-repo dependencies from linkerd2 to linkerd2-proxy-init.
(We expect the cni-plugin code to move in a followup change).
This change duplicates the port range parsing utility (about 50 lines,
plus tests). This avoids stray dependencies on linkerd2-proxy-init.
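A sketch of the kind of port-range parsing utility being duplicated (not the exact copied code):
```go
package sketch

import (
	"fmt"
	"strconv"
	"strings"
)

// PortRange is a closed range of ports, e.g. "4143" or "4000-4999".
type PortRange struct {
	LowerBound int
	UpperBound int
}

// ParsePortRange parses a single port or a hyphenated range.
func ParsePortRange(s string) (PortRange, error) {
	bounds := strings.SplitN(s, "-", 2)
	lower, err := strconv.Atoi(strings.TrimSpace(bounds[0]))
	if err != nil {
		return PortRange{}, fmt.Errorf("%q is not a valid lower-bound: %w", bounds[0], err)
	}
	upper := lower
	if len(bounds) == 2 {
		if upper, err = strconv.Atoi(strings.TrimSpace(bounds[1])); err != nil {
			return PortRange{}, fmt.Errorf("%q is not a valid upper-bound: %w", bounds[1], err)
		}
	}
	if lower < 0 || upper > 65535 || upper < lower {
		return PortRange{}, fmt.Errorf("%q is not a valid port range", s)
	}
	return PortRange{LowerBound: lower, UpperBound: upper}, nil
}
```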
Signed-off-by: Oliver Gould <ver@buoyant.io>
Watch events for objects in the kube-system namespace were previously ignored.
In certain situations, this would cause the destination service to return
invalid (outdated) endpoints for services in kube-system - including unmeshed
services.
It [was suggested][1] that kube-system events were ignored to avoid handling
frequent Endpoint updates - specifically from [controllers using Endpoints for
leader elections][2]. As of Kubernetes 1.20, these controllers [default to using
Leases instead of Endpoints for their leader elections][3], obviating the need
to exclude (or filter) updates from kube-system. The exclusions have been
removed accordingly.
[1]: https://github.com/linkerd/linkerd2/pull/4133#issuecomment-594983588
[2]: https://github.com/kubernetes/kubernetes/issues/86286
[3]: https://github.com/kubernetes/kubernetes/pull/94603
Signed-off-by: Jacob Henner <code@ventricle.us>
Fixes #8592
Increase the minimum supported Kubernetes version from 1.20 to 1.21. This allows us to drop support for batch/v1beta1/CronJob and discovery/v1beta1/EndpointSlices, instead using only v1 of those resources. This fixes the deprecation warnings about these resources printed by the CLI.
Signed-off-by: Alex Leong <alex@buoyant.io>
Fixes #8134
The multicluster probe-gateway service (which is used by the service-mirror controller to send health probes to the remote gateway) has `mirror.linkerd.io/cluster-name` label but it does not have a `mirror.linkerd.io/remote-svc-fq-name` annotation. This makes sense because the probe-gateway service does not correspond to any target service on the remote cluster and instead targets the gateway itself.
We update the logic in the destination controller so that we can still add the `dst_target_cluster` metric label when only the `mirror.linkerd.io/cluster-name` label is present but not the `mirror.linkerd.io/remote-svc-fq-name` annotation. This allows us to have the `dst_target_cluster` metric label for service-mirror controller probe traffic.
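A sketch of that label logic; the constant is the label named in this message, everything else is illustrative:
```go
package sketch

// clusterNameLabel is the multicluster label checked below; the
// mirror.linkerd.io/remote-svc-fq-name annotation is intentionally
// not required here.
const clusterNameLabel = "mirror.linkerd.io/cluster-name"

// targetClusterLabel returns the dst_target_cluster metric label whenever the
// cluster-name label is present, even when the remote-svc-fq-name annotation
// is absent (as it is on the probe-gateway service).
func targetClusterLabel(svcLabels map[string]string) map[string]string {
	metricLabels := map[string]string{}
	if cluster, ok := svcLabels[clusterNameLabel]; ok {
		metricLabels["dst_target_cluster"] = cluster
	}
	return metricLabels
}
```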
Signed-off-by: Alex Leong <alex@buoyant.io>
We frequently compare data structures--sometimes very large data
structures--that are difficult to compare visually. This change replaces
uses of `reflect.DeepEqual` with `deep.Equal`. `go-test`'s `deep.Equal`
returns a diff of values that are not equal.
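An example of the usage pattern in a test (contrived values; the point is the per-field diff on failure):
```go
package sketch

import (
	"testing"

	"github.com/go-test/deep"
)

func TestAddressSetsMatch(t *testing.T) {
	got := map[string]int{"10.42.0.1": 8080, "10.42.0.2": 8080}
	want := map[string]int{"10.42.0.1": 8080, "10.42.0.2": 8080}

	// Unlike reflect.DeepEqual, deep.Equal returns a slice of human-readable
	// differences (e.g. "map[10.42.0.2]: 8080 != 9090"), which makes failures
	// on large structures much easier to read.
	if diff := deep.Equal(got, want); diff != nil {
		t.Errorf("unexpected diff: %v", diff)
	}
}
```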
Signed-off-by: Oliver Gould <ver@buoyant.io>
Opaque ports may be configured through annotations, where the value may
be either a range (e.g. 0 - 255) or a comma-delimited string
("121,122"). When configured as a comma-delimited string, our parsing
logic will trim any leading and trailing commas and split the value;
ports will be converted from strings to their int counterparts.
If whitespace is used, such that the value looks similar to "121,
122", the parsing logic will fail -- when attempting to convert the
string into an integer -- with the following error: `" 122" is not a
valid lower-bound`. This can lead to confusion for users whose services
and endpoints function with undefined behaviour.
This change introduces logic to strip any leading or trailing
whitespaces from strings when tokenizing the annotation value; this way,
we are guaranteed not to experience parsing errors when converting
strings. To validate the behaviour, a new (unit) test has been added to
the opaque ports watcher with a multi-opaque-port annotation on a
service, where the value contains a space.
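A simplified sketch of the tokenizing fix (single ports only; the real parser also handles ranges):
```go
package sketch

import (
	"strconv"
	"strings"
)

// parseOpaquePorts tokenizes an annotation value such as "121, 122" and trims
// surrounding whitespace before converting each token, so values with spaces
// no longer fail with errors like `" 122" is not a valid lower-bound`.
func parseOpaquePorts(annotation string) ([]int, error) {
	var ports []int
	for _, tok := range strings.Split(annotation, ",") {
		tok = strings.TrimSpace(tok)
		if tok == "" {
			continue
		}
		port, err := strconv.Atoi(tok)
		if err != nil {
			return nil, err
		}
		ports = append(ports, port)
	}
	return ports, nil
}
```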
Signed-off-by: Matei David <matei@buoyant.io>
While looking into #8273, I wanted to confirm that the destination
controller uses the default opaque ports configuration for arbitrary
(unmeshed) IPs.
This change adds a test that exercises resolution behavior for external
IPs.
Signed-off-by: Oliver Gould <ver@buoyant.io>
The destination controller can improperly handle updates by returning a
map reference instead of a new data structure. This breaks diffing logic,
as newly added endpoints appear to pre-exist.
This change ensures that a fresh data structure is used when handling
discovery updates.
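A sketch of the fix's shape, with a hypothetical `Address` type: copy the internal map before handing it to callers so diffs compare distinct snapshots:
```go
package sketch

// Address is a stand-in for the destination controller's address value type.
type Address struct{ IP string }

// snapshotAddresses returns a copy of the watcher's internal map so that
// callers can diff old and new sets without both referring to the same
// underlying storage (the bug described above).
func snapshotAddresses(internal map[string]Address) map[string]Address {
	fresh := make(map[string]Address, len(internal))
	for id, addr := range internal {
		fresh[id] = addr
	}
	return fresh
}
```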
Fixes #8143
Signed-off-by: Alex Leong <alex@buoyant.io>
Closes #7881
This makes the rest of the necessary fixes to satisfy the `staticcheck` lint.
The only class of lints being skipped are those related to deprecated tap code. There was some discussion on the original change started by @adleong about whether this is _actually_ deprecated [here](https://github.com/linkerd/linkerd2/pull/3240#discussion_r313634584); it doesn't look like we ever came back around to fully removing it, but I don't think it should be a blocker for enabling the lint right now.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Closes #7826
This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed by individual changes, but this enables them by default so that all future changes are caught ahead of time.
A significant number of these lints have been excluded by the various `exclude-rules` added to `.golangci.yml`. These include operations on files that generally do not fail, such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors when cleaning up functions via the `defer` keyword.
Aside from those, there are several other rules added that all have comments explaining why it's okay to ignore the errors that they cover.
Finally, several smaller fixes in the code have been made where it seems necessary to catch errors or at least log them.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
[gocritic][gc] helps to enforce some consistency and check for potential
errors. This change applies linting changes and enables gocritic via
golangci-lint.
[gc]: https://github.com/go-critic/go-critic
Signed-off-by: Oliver Gould <ver@buoyant.io>
Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks
that error formatting and inspection is wrapping-aware.
This change enables `errorlint` in golangci-lint and updates all error
handling code to pass the lint. Some comparisons in tests have been left
unchanged (using `//nolint:errorlint` comments).
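For illustration, the kind of wrapping-aware pattern `errorlint` pushes toward (standard library only, not code from this repo):
```go
package sketch

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readConfig wraps the underlying error with %w so callers can still inspect
// the cause after the message has been annotated.
func readConfig(path string) ([]byte, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("failed to read config %s: %w", path, err)
	}
	return b, nil
}

// configExists inspects errors with errors.Is/errors.As rather than ==, which
// misses wrapped errors and is exactly what errorlint flags.
func configExists(path string) bool {
	_, err := readConfig(path)
	if errors.Is(err, fs.ErrNotExist) {
		return false
	}
	var pathErr *fs.PathError
	if errors.As(err, &pathErr) {
		// Still a filesystem error, but not "not exist".
		return false
	}
	return err == nil
}
```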
[el]: https://github.com/polyfloyd/go-errorlint
Signed-off-by: Oliver Gould <ver@buoyant.io>
We use `fmt.Sprintf` to format URIs in several places we could be using
`net.JoinHostPort`. `net.JoinHostPort` ensures that IPv6 addresses are
properly escaped in URIs.
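A small example of the difference:
```go
package sketch

import (
	"net"
	"strconv"
)

// apiAddr builds a host:port authority. net.JoinHostPort brackets IPv6
// literals (apiAddr("::1", 8086) == "[::1]:8086"), whereas a naive
// fmt.Sprintf("%s:%d", host, port) would produce the unparseable "::1:8086".
func apiAddr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}
```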
Signed-off-by: Oliver Gould <ver@buoyant.io>
* Remove the `proxy.disableIdentity` config
Fixes #7724
Also:
- Removed the `linkerd.io/identity-mode` annotation.
- Removed the `config.linkerd.io/disable-identity` annotation.
- Removed the `linkerd.proxy.validation` template partial, which only
made sense when `proxy.disableIdentity` was `true`.
- TestInjectManualParams now requires hitting the cluster to retrieve the
trust root.
Fixes #6337
GetProfile can be called with an FQDN for a specific member of a service, e.g.
```
web-0.foo.ns.svc.cluster.local
```
If that endpoint is not backed by a pod, `GetProfile` will not return an endpoint in the response.
We update the logic to return an endpoint in the response even when the endpoint is not backed by a pod.
Signed-off-by: Alex Leong <alex@buoyant.io>
### What
`GetProfile` clients do not receive destination profiles that consider Server protocol fields the way that `Get` clients do. If a Server exists for a `GetProfile` destination that specifies that the protocol for that destination is `opaque`, this information is not passed back to the client.
#7184 added this for `Get` by subscribing clients to Endpoint/EndpointSlice updates. When there is an update, or there is a Server update, the endpoints watcher passes this information back to the endpoint translator which handles sending the update back to the client.
For `GetProfile` the situation is different. As with `Get`, we only consider Servers when dealing with Pod IPs, but this only occurs in two situations for `GetProfile`.
1. The destination is a Pod IP and port
2. The destination is an Instance ID and port
In both of these cases, we need to check if a Server already selects the endpoint, and we need to subscribe for Server updates in case one that selects the endpoint is added or deleted.
### How
First we check if there is already a Server which selects the endpoint. This is so that when the first destination profile is returned, the client knows if the destination is `opaque` or not.
After sending that first update, we then subscribe the client for any future updates which will come from a Server being added or deleted.
This is handled by the new `ServerWatcher` which watches for Server updates on the cluster; when an update occurs it sends that to the `endpointProfileTranslator` which translates the protocol update into a DestinationProfile.
By introducing the `endpointProfileTranslator`, which only handles protocol updates, we're able to decouple the endpoint logic from `profileTranslator`—its `endpoint` field has been removed now that it only handles updates for ServiceProfiles for Services.
### Testing
A unit test has been added and below are some manual testing instructions to see how it interacts with Server updates:
<details>
<summary>app.yaml</summary>
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
  labels:
    app: pod
spec:
  containers:
  - name: app
    image: nginx
    ports:
    - name: http
      containerPort: 80
---
apiVersion: policy.linkerd.io/v1beta1
kind: Server
metadata:
  name: srv
  labels:
    policy: srv
spec:
  podSelector:
    matchLabels:
      app: pod
  port: 80
  proxyProtocol: opaque
```
</details>
```shell
$ go run ./controller/cmd/main.go destination
```
```shell
$ linkerd inject app.yaml |kubectl apply -f -
...
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod 2/2 Running 0 53m 10.42.0.34 k3d-k3s-default-server-0 <none> <none>
$ go run ./controller/script/destination-client/main.go -method getProfile -path 10.42.0.34:80
...
```
You can add/delete `srv` as well as edit its `proxyProtocol` field to observe the correct DestinationProfile updates.
Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
Now that SMI functionality is fully being moved into the
[linkerd-smi](https://github.com/linkerd/linkerd-smi) extension, we can
stop supporting its functionality by default.
This means that the `destination` component will stop reacting
to `TrafficSplit` objects. When `linkerd-smi` is installed,
it does the conversion of `TrafficSplit` objects to `ServiceProfiles`
that destination components can understand, and will react accordingly.
Also, whenever a `ServiceProfile` with traffic splitting is associated
with a service, the same information (i.e. splits and weights) is also
surfaced through the `UI` (in the new `services` tab) and the `viz cmd`.
So, we are not really losing any UI functionality here.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
This change ensures that if a Server exists with `proxyProtocol: opaque` that selects an endpoint backed by a pod, that destination requests for that pod reflect the fact that it handles opaque traffic.
Currently, the only way that opaque traffic is honored in the destination service is if the pod has the `config.linkerd.io/opaque-ports` annotation. With the introduction of Servers though, users can set `server.Spec.ProxyProtocol: opaque` to indicate that if a Server selects a pod, then traffic to that pod's `server.Spec.Port` should be opaque. Currently, the destination service does not take this into account.
There is an existing change up that _also_ adds this functionality; it takes a different approach by creating a policy server client for each endpoint that a destination has. For `Get` requests on a service, the number of clients scales with the number of endpoints that back that service.
This change fixes that issue by instead creating a Server watch in the endpoint watcher and sending updates through to the endpoint translator.
The two primary scenarios to consider are
### A `Get` request for some service is streaming when a Server is created/updated/deleted
When a Server is created or updated, the endpoint watcher iterates through its endpoint watches (`servicePublisher` -> `portPublisher`) and if it selects any of those endpoints, the port publisher sends an update if the Server has marked that port as opaque.
When a Server is deleted, the endpoint watcher once again iterates through its endpoint watches and deletes the address set's `OpaquePodPorts` field—ensuring that updates have been cleared of Server overrides.
### A `Get` request for some service happens after a Server is created
When a `Get` request occurs (or new endpoints are added—they both take the same path), we must check if any of those endpoints are selected by some existing Server. If so, we have to take that into account when creating the address set.
This part of the change gives me a little concern as we first must get all the Servers on the cluster and then create a set of _all_ the pod-backed endpoints that they select in order to determine if any of these _new_ endpoints are selected.
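A sketch of that check, with a trimmed-down `serverSketch` type in place of the real Server CRD (the actual Server port can also be a named port, simplified here to an int):
```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// serverSketch trims the policy.linkerd.io Server resource down to the fields
// this check needs.
type serverSketch struct {
	PodSelector   *metav1.LabelSelector
	Port          int32
	ProxyProtocol string
}

// portIsOpaque reports whether any known Server selects the pod and marks the
// given port as opaque; the endpoint watcher would run a check like this when
// endpoints appear after a Server already exists.
func portIsOpaque(pod corev1.Pod, port int32, servers []serverSketch) (bool, error) {
	for _, srv := range servers {
		selector, err := metav1.LabelSelectorAsSelector(srv.PodSelector)
		if err != nil {
			return false, err
		}
		if selector.Matches(labels.Set(pod.Labels)) && srv.Port == port && srv.ProxyProtocol == "opaque" {
			return true, nil
		}
	}
	return false, nil
}
```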
## Testing
Right now this can be tested by starting up the destination service locally and running `Get` requests on a service that has endpoints selected by a Server
**app.yaml**
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
  labels:
    app: pod
spec:
  containers:
  - name: app
    image: nginx
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: svc
spec:
  selector:
    app: pod
  ports:
  - name: http
    port: 80
---
apiVersion: policy.linkerd.io/v1alpha1
kind: Server
metadata:
  name: srv
  labels:
    policy: srv
spec:
  podSelector:
    matchLabels:
      app: pod
  port: 80
  proxyProtocol: HTTP/1
```
```bash
$ go run controller/script/destination-client/main.go -path svc.default.svc.cluster.local:80
```
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
In our chart values and (some) integration tests, we're using a deprecated
label for node selection. According to the warning messages we get during
installation, the label has been deprecated since k8s `v1.14`:
```
Warning: spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead
Warning: spec.jobTemplate.spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead
```
This PR replaces all occurrences of `beta.kubernetes.io/os` with
`kubernetes.io/os`.
Fixes #7225
When the `mirror.linkerd.io/remote-gateway-identity` and `mirror.linkerd.io/remote-svc-fq-name` annotations are set on an EndpointSlice object, the destination controller does not return the correct identity hints for that endpoint.
We fix an incorrect assignment to address this. We also fix some logic that could result in a nil pointer dereference instead of comparing empty strings. We add a test case to exercise these fixes.
Signed-off-by: Alex Leong <alex@buoyant.io>
## background
In order to upgrade `client-go` and other related libraries to `v0.22.0`, we had to address the deprecation of the service's `TopologyKeys` field. This field and its related feature have been deprecated and superseded by [Topology Aware Hints](https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/2433-topology-aware-hints/README.md).
The goal of topology aware hints is to provide a simpler way for users to prefer endpoints by basing decisions solely on the node's `topology.kubernetes.io/zone` label. If a node is in `zone-a`, then it should prefer endpoints that _should_ be consumed by clients in `zone-a`.
kube-proxy (and now the destination controller) know that an endpoint _should_ be consumed by clients in certain zones if its `Hints.ForZones` field is set with a zone value that matches that of the client.
For example, the endpoint slice controller may add the following hint to an endpoint:
```
- addresses: ["1.1.1.1"]
zone: "zone-a"
hints:
zone: "zone-b"
```
The above endpoint is an endpoint that is located in `zone-a` but should be consumed by clients in `zone-b`.
## changes
Now that topological preference is not a concept, we can remove it from the `servicePublisher` and `portPublisher` structs. The fields were only there so that the value could be propagated down to individual addresses.
The `Hints` field is only present on endpoints that belong to an `EndpointSlice`, so use of this field is limited to the `endpointSliceToAddresses` function.
When endpoint slices are translated to an `AddressSet` now, for each address (endpoint) we make sure to copy the `Hints.ForZones` field if it is present. This field is only present if it's set by the endpoint slice controller and it has [several safeguards](https://kubernetes.io/docs/concepts/services-networking/topology-aware-hints/#safeguards).
After `endpointSliceToAddresses` has translated an endpoint slice into an `AddressSet` and updated the endpoint translator's `availableEndpoints`, filtering takes place and is the crux of this change.
For each potential address that we have to consider in `availableEndpoints`, we make sure to only return the set of addresses whose consumption zones (the zones in the `forZones` field) match the node's zone. That way, we only communicate with endpoints that the endpoint slice controller has labeled for the current node we're on.
This allows us to remove the ordering/hierarchy of topological region and considering the `*` value.
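A sketch of that filtering step, with an illustrative `address` type in place of the watcher's own:
```go
package sketch

// address mirrors just the fields that matter here: the zones for which the
// endpoint slice controller hinted this endpoint (Hints.ForZones).
type address struct {
	IP       string
	ForZones []string
}

// filterAddressesForZone keeps only the addresses hinted for the node's zone;
// what to do when nothing matches is left to the caller.
func filterAddressesForZone(nodeZone string, available map[string]address) map[string]address {
	filtered := make(map[string]address, len(available))
	for id, addr := range available {
		for _, zone := range addr.ForZones {
			if zone == nodeZone {
				filtered[id] = addr
				break
			}
		}
	}
	return filtered
}
```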
## testing
I've added a unit test which creates an endpoint translator tied to a node in `west-1a` and asserts that it only handles updates for addresses that should be consumed by clients in `west-1a`.
Closes #6637
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Make destination info logs clearer by changing the log level of the watchers' log messages 'Establishing watch', 'Starting watch' and 'Stopping watch' from info to debug (#6917)
Signed-off-by: Bart Peeters <birtpeeters@hotmail.com>
Fixes #5589
The core control plane has a dependency on the viz package in order to use the `BuildResource` function. This "backwards" dependency means that the viz source code needs to be included in core docker-builds and is bad for code hygiene.
We move the `BuildResource` function into the viz package. In `cli/cmd/metrics.go` we replace a call to `BuildResource` with a call directly to `CanonicalResourceNameFromFriendlyName`.
Signed-off-by: Alex Leong <alex@buoyant.io>
Closes #6253
### What
---
When we send a profile request with a pod IP, we get back an endpoint as part of the response. This has two advantages: we avoid building a load balancer and we can treat endpoint failure differently (with more of a fail fast approach). At the moment, when we use a pod DNS as the target of the profile lookup, we don't have an endpoint returned in the
response.
Through this change, the behaviour will be consistent. Whenever we look up a pod (either through IP or DNS name) we will get an endpoint back. The change also attempts to simplify some of the logic in GetProfile.
### How
---
We already have a way to build an endpoint and return it back to the client; I sought to re-use most of the code in an effort to also simplify `GetProfile()`. I extracted most of the code that would have been duplicated into a separate method that is responsible for building the address, looking at annotations for opaque ports and for sending the response back.
In addition, to support a pod DNS fqn I've expanded on the `else` branch of the topmost if statement -- if our host is not an IP, we parse the host to get the k8s fqn. If the parsing function returns an instance ID along with the ServiceID, then we know we are dealing directly with a pod -- if we do, we fetch the pod using the core informer and then return an endpoint for it.
### Tests
---
I've tested this mostly with the destination client script. For the tests, I used the following pods:
```
❯ kgp -n emojivoto -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
voting-ff4c54b8d-zbqc4 2/2 Running 0 3m58s 10.42.0.53 k3d-west-server-0 <none> <none>
web-0 2/2 Running 0 3m58s 10.42.0.55 k3d-west-server-0 <none> <none>
vote-bot-7d89964475-tfq7j 2/2 Running 0 3m58s 10.42.0.54 k3d-west-server-0 <none> <none>
emoji-79cc56f589-57tsh 2/2 Running 0 3m58s 10.42.0.52 k3d-west-server-0 <none> <none>
# emoji pod has an opaque port set to 8080.
# web-svc is a headless service and it backs a statefulset (which is why we have web-0).
# without a headless service we can't lookup based on pod DNS.
```
**`Responses before the change`**:
```
# request on IP, this is how things work at the moment. I included this because there shouldn't be
# any diff between the response given here and the response we get with the change.
# note: this corresponds to the emoji pod which has opaque ports set to 8080.
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.52:8080
INFO[0000] opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524724} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"emoji"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"emoji-79cc56f589-57tsh"} metric_labels:{key:"pod_template_hash" value:"79cc56f589"} metric_labels:{key:"serviceaccount" value:"emoji"} tls_identity:{dns_like_identity:{name:"emoji.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{} opaque_transport:{inbound_port:4143}}}
INFO[0000]
# request web-0 by IP
# there shouldn't be any diff with the response we get after the change
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.55:8080
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524727} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"web-0"} metric_labels:{key:"serviceaccount" value:"web"} metric_labels:{key:"statefulset" value:"web"} tls_identity:{dns_like_identity:{name:"web.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
INFO[0000]
# request web-0 by DNS name -- will not work.
❯ go run controller/script/destination-client/main.go -method getProfile -path web-0.web-svc.emojivoto.svc.cluster.local:8080
INFO[0000] fully_qualified_name:"web-0.web-svc.emojivoto.svc.cluster.local" retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"web-svc.emojivoto.svc.cluster.local.:8080" weight:10000}
INFO[0000]
INFO[0000] fully_qualified_name:"web-0.web-svc.emojivoto.svc.cluster.local" retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"web-svc.emojivoto.svc.cluster.local.:8080" weight:10000}
INFO[0000]
# ^
# |
# --> no endpoint in the response
```
**`Responses after the change`**:
```
# request profile for emoji, we see opaque transport being set on the endpoint.
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.52:8080
INFO[0000] opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524724} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"emoji"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"emoji-79cc56f589-57tsh"} metric_labels:{key:"pod_template_hash" value:"79cc56f589"} metric_labels:{key:"serviceaccount" value:"emoji"} tls_identity:{dns_like_identity:{name:"emoji.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{} opaque_transport:{inbound_port:4143}}}
INFO[0000]
# request profile for web-0 with IP.
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.55:8080
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524727} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"web-0"} metric_labels:{key:"serviceaccount" value:"web"} metric_labels:{key:"statefulset" value:"web"} tls_identity:{dns_like_identity:{name:"web.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
INFO[0000]
# request profile for web-0 with pod DNS, resp contains endpoint.
❯ go run controller/script/destination-client/main.go -method getProfile -path web-0.web-svc.emojivoto.svc.cluster.local:8080
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:170524727} port:8080} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"namespace" value:"emojivoto"} metric_labels:{key:"pod" value:"web-0"} metric_labels:{key:"serviceaccount" value:"web"} metric_labels:{key:"statefulset" value:"web"} tls_identity:{dns_like_identity:{name:"web.emojivoto.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
INFO[0000]
```
Signed-off-by: Matei David <matei@buoyant.io>
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).
The misspellings have been reported at 0d56327e6f (commitcomment-51603624)
The action reports that the changes in this PR would make it happy: 03a9c310aa
Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
This updates the destination to prefer `serviceprofiles.dstOverrides`
over `trafficsplits`. This is important because ServiceProfiles should take
precedence over TrafficSplits when both are present.
This also makes integration testing the `smi-adaptor` easier.
This also adds unit tests in the `traffic_split_adaptor` to check
for the same.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
While uncommon, if H2 upgrades are disabled it's possible for an opaque workload
to not have its hint.OpaqueTransport field set in its destination profile
response.
This changes the H2 upgrade enabled check to be specific for setting the
hint.Protocol while allowing hint.OpaqueTransport to be set independent of
that value.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
Co-authored-by: Oliver Gould <ver@buoyant.io>
* Support Traffic Splitting through `ServiceProfile.dstOverrides`
This commit
- updates the destination logic to prevent `ServiceProfile.dstOverrides` from being overridden when no `TrafficSplit` is present. This means that any `serviceprofiles.dstOverrides` set by the user are correctly propagated to the proxy, thus making traffic splitting through service profiles work.
- This also updates `profile_translator.toDstOverrides` to use the port from `GetProfile` when there is no port in `dstOverrides.authority`, as sketched below.
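A sketch of that port fallback (hypothetical helper name, not the actual `toDstOverrides` code):
```go
package sketch

import (
	"net"
	"strconv"
)

// authorityWithPort ensures a dstOverride authority carries a port, falling
// back to the port from the GetProfile request when the authority omits one.
func authorityWithPort(authority string, profilePort int) string {
	if _, _, err := net.SplitHostPort(authority); err == nil {
		return authority // already has a port
	}
	return net.JoinHostPort(authority, strconv.Itoa(profilePort))
}
```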
After these changes, the following `ServiceProfile` to perform
traffic splitting should work.
```yaml
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
  name: backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local
spec:
  dstOverrides:
  - authority: "backend-svc.linkerd-trafficsplit-test-sp.svc.cluster.local.:8080"
    weight: 500m
  - authority: "failing-svc.linkerd-trafficsplit-test-sp.svc.cluster.local.:8080"
    weight: 500m
```
This PR also adds an integration test, `TestTrafficSplitCliWithSP`, which checks
traffic splitting through Service Profiles. This integration test follows the same pattern
as the other traffic split integration tests, but instead we perform `linkerd viz stat` on the
client and check that the expected server objects are present, as `linkerd viz stat ts`
does not _yet_ work with Service Profiles.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* destination: Remove support for IP Queries in `Get` API
Fixes #5246
This PR updates the destination to report an error when `Get`
is called for IP queries. As the issue mentions, the proxies
are not using this API anymore, and this helps to simplify and
remove unnecessary logic.
This removes the relevant `IPWatcher` logic, along with its
unit tests.
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
* Remove the `linkerd-controller` pod
Now that we got rid of the `Version` API (#6000) and the destination API forwarding business in `linkerd-controller` (#5993), we can get rid of the `linkerd-controller` pod.
## Removals
- Deleted everything under `/controller/api/public` and `/controller/cmd/public-api`.
- Moved `/controller/api/public/test_helper.go` to `/controller/api/destination/test_helper.go` because those are really utils for destination testing. I also extracted from there the prometheus mock structs and put them under `/pkg/prometheus/test_helper.go`, which is now used by both the `linkerd diagnostics endpoints` and the `metrics-api` tests, removing some duplication.
- Deleted the `controller.yaml` and `controller-rbac.yaml` helm templates along with the `publicAPIResources` and `publicAPIProxyResources` helm values.
## Health checks
- Removed the `can initialize the client` check given such client is no longer needed. The `linkerd-api` section was left with only the check `control pods are ready`, so I moved that under the `linkerd-existence` section and got rid of the `linkerd-api` section altogether.
- In that same `linkerd-existence` section, got rid of the `controller pod is running` check.
## Other changes
- Fixed the Control Plane section of the dashboard, taking into account the disappearance of `linkerd-controller` and, previously, of `linkerd-sp-validator`.
When a pod is configured with the `skip-inbound-ports` annotation, a client
proxy trying to connect to that pod attempts to connect to it via H2 and
also tries to initiate a TLS connection. This issue is caused by the
destination controller sending protocol and TLS hints to the
client proxy for that skipped port.
This change fixes the destination controller so that it no longer
sends protocol and TLS identity hints to outbound proxies resolving a
`podIP:port` that is on a skipped inbound port.
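A sketch of the guard, assuming the `config.linkerd.io/skip-inbound-ports` annotation key and single-port values only (the real annotation also accepts ranges):
```go
package sketch

import (
	"strconv"
	"strings"

	corev1 "k8s.io/api/core/v1"
)

// portSkipped reports whether the resolved port appears in the pod's
// skip-inbound-ports annotation; when it does, the destination response
// should omit the H2 protocol hint and TLS identity.
func portSkipped(pod corev1.Pod, port uint32) bool {
	annotation := pod.Annotations["config.linkerd.io/skip-inbound-ports"]
	for _, tok := range strings.Split(annotation, ",") {
		skipped, err := strconv.ParseUint(strings.TrimSpace(tok), 10, 32)
		if err != nil {
			continue
		}
		if uint32(skipped) == port {
			return true
		}
	}
	return false
}
```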
I've included a test that exhibits this error prior to this fix but you
can also test the prior behavior by:
```bash
curl https://run.linkerd.io/booksapp.yml > booksapp.yaml
# edit either the books or authors service to:
#   1. configure a failure rate of 0.0
#   2. add the `skip-inbound-ports` config annotation
bin/linkerd viz stat pods webapp
# There should be no successful requests on the webapp deployment.
```
Fixes #5995
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
* Removed `Version` API from the public-api
This is a sibling PR to #5993, and it's the second step towards removing the `linkerd-controller` pod.
This one deals with a replacement for the `Version` API, fetching instead the `linkerd-config` CM and retrieving the `LinkerdVersion` value.
## Changes to the public-api
- Removal of the `publicPb.ApiClient` entry from the `Client` interface
- Removal of the `publicPb.ApiServer` entry from the `Server` interface
- Removal of the `Version` and related methods from `client.go`, `grpc_server.go` and `http_server.go`
## Changes to `linkerd version`
- Removal of all references to the public API.
- Call `healthcheck.GetServerVersion` to retrieve the version
## Changes to `linkerd check`
- Removal of the "can query the control API" check from the "linkerd-api" section
- Addition of a new "can retrieve the control plane version" check under the "control-plane-version" section
## Changes to `linkerd-web`
- The version is now retrieved from the `linkerd-config` CM instead of a public-API call.
- Removal of all references to the public API.
- Removal of the `data-go-version` global attribute on the dashboard, which wasn't being used.
## Other changes
- Added `ValuesFromConfigMap` function in `values.go` to convert the `linkerd-config` CM into a `*Values` struct instance
- Removal of the `public` protobuf
- Refactor 'linkerd repair' to use the refactored 'healthcheck.GetServerVersion()' function
* Removed Destination's `Get` API from the public-api
This is the first step towards removing the `linkerd-controller` pod. It deals with removing the Destination `Get` HTTP and gRPC endpoints it exposes, which only the `linkerd diagnostics endpoints` command consumes.
Removed all references to Destination in the public-api, including all the gRPC-to-http-to-gRPC forwardings:
- Removed the `Get` method from the public-api gRPC server that forwarded the connection from the controller pod to the destination pod. Clients should now connect directly to the destination service via gRPC.
- Likewise, removed the destination boilerplate in the public-api http server (and its `main.go`) that served the `Get` destination endpoint and forwarded it into the gRPC server.
- Finally, removed the destination boilerplate from the public-api's `client.go` that created a client connecting to the http API.
Fixes #5939
Some CNIs reassign the IP of a terminating pod to a new pod, which
leads to duplicate IPs in the cluster.
It eventually triggers #5939.
This commit makes the IPWatcher, when given an IP, filter out terminating pods
(those with a deletionTimestamp set).
The issue is hard to reproduce because we are not able to assign a
particular IP to a pod manually.
Signed-off-by: Bruce <wenliang.chen@personio.de>
Co-authored-by: Bruce <wenliang.chen@personio.de>
This fixes an issue where pod lookups by host IP and host port fail even though
the cluster has a matching pod.
Usually these manifested as `FailedPrecondition` errors, but the messages were
too long and resulted in http/2 errors. This change depends on #5893 which fixes
that separate issue.
This changes how often those `FailedPrecondition` errors actually occur. The
destination service now considers pod host IPs and should reduce the frequency
of those errors.
Closes #5881
---
Lookups like this happen when a pod is created with a host IP and host port set
in its spec. It still has a pod IP when running, but requests to
`hostIP:hostPort` will also be redirected to the pod. Combinations of host IP
and host Port are unique to the cluster and enforced by Kubernetes.
Currently, the destination service fails to find pods in this scenario because
we only keep an index of pods and their pod IPs, not pods and their host IPs.
To fix this, we now also keep an index of pods and their host IPs—if and only if
they have the host IP set.
Now when doing a pod lookup, we consider both the IP and the port. We perform
the following steps:
1. Do a lookup by IP in the pod podIP index
- If only one pod is found then return it
2. 0 or more than 1 pods have the same pod IP
3. Do a lookup by IP in the pod hostIP index
- If any number of pods were found, we know that IP maps to a node IP.
Therefore, we search for a pod with a matching host Port. If one exists then
return it; if not then there is no pod that matches `hostIP:port`
4. The IP does not map to a host IP
5. If multiple pods were found in `1`, then we know there are pods with
conflicting podIPs and an error is returned
6. If no pods were found in `1` then there is no pod that matches `IP:port`; a sketch of this lookup order follows below
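A sketch of that lookup order, with the indexers reduced to plain maps for readability:
```go
package sketch

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// getPodByIP sketches the lookup order described above, using podIP -> pods
// and hostIP -> pods maps in place of the real informer indexers.
func getPodByIP(podsByPodIP, podsByHostIP map[string][]corev1.Pod, ip string, port uint32) (*corev1.Pod, error) {
	byPodIP := podsByPodIP[ip]
	if len(byPodIP) == 1 {
		return &byPodIP[0], nil
	}
	if byHostIP := podsByHostIP[ip]; len(byHostIP) > 0 {
		// The IP belongs to a node: match on hostPort instead.
		for i, pod := range byHostIP {
			for _, c := range pod.Spec.Containers {
				for _, p := range c.Ports {
					if uint32(p.HostPort) == port {
						return &byHostIP[i], nil
					}
				}
			}
		}
		return nil, nil // node IP, but nothing bound to hostIP:port
	}
	if len(byPodIP) > 1 {
		return nil, fmt.Errorf("conflicting pod IP %s", ip)
	}
	return nil, nil // no pod matches IP:port
}
```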
---
Aside from the additional IP watcher test being added, this can be tested with
the following steps:
1. Create a kind cluster. kind is required because its pods in `kube-system`
have the same pod IPs; this is not the case with k3d: `bin/kind create cluster`
2. Install Linkerd with `4445` marked as opaque: `linkerd install --set
proxy.opaquePorts="4445" |kubectl apply -f -`
3. Get the node IP: `kubectl get -o wide nodes`
4. Pull my fork of `tcp-echo`:
```
$ git clone https://github.com/kleimkuhler/tcp-echo
...
$ git checkout --track kleimkuhler/host-pod-repro
```
5. `helm package .`
6. Install `tcp-echo` with the server not injected and correct host IP: `helm
install tcp-echo tcp-echo-0.1.0.tgz --set server.linkerdInject="false" --set
hostIP="..."`
7. Looking at the client's proxy logs, you should not observe any errors or
protocol detection timeouts.
8. Looking at the server logs, you should see all the requests coming through
correctly.
Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>