Commit Graph

87 Commits

Author SHA1 Message Date
Alejandro Pedraza 69ecb7f10d
Replace `server_port_subscribers` metric (#11206)
Fixes #10764

Removed the `server_port_subscribers` gauge, as it wasn't distinguishing
amongst different pods, and the subscriber counts for different pods were
conflicting with one another when updating the metric (see more details
[here](https://github.com/linkerd/linkerd2/issues/10764#issuecomment-1650835823)).

Besides carrying an invalid value, this was generating the warning
`unable to delete server_port_subscribers metric with labels`

The metric was replaced with the `server_port_subscribes` and
`server_port_unsubscribes` counters, which track the overall number of
subscribes and unsubscribes for a given pod port.
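
For illustration, the counter pair could be registered roughly like this with `client_golang` (a sketch: the helper names `recordSubscribe`/`recordUnsubscribe` are made up, only the metric and label names come from this change):

```go
package watcher

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// Counters labeled by pod namespace/name and port; incremented on every
// subscribe/unsubscribe rather than tracking a per-pod gauge that can be
// clobbered by other pods sharing the same labels.
var (
	portSubscribes = promauto.NewCounterVec(prometheus.CounterOpts{
		Name: "server_port_subscribes",
		Help: "Subscribes to Server changes for a pod's port.",
	}, []string{"namespace", "name", "port"})

	portUnsubscribes = promauto.NewCounterVec(prometheus.CounterOpts{
		Name: "server_port_unsubscribes",
		Help: "Unsubscribes from Server changes for a pod's port.",
	}, []string{"namespace", "name", "port"})
)

func recordSubscribe(ns, pod, port string)   { portSubscribes.WithLabelValues(ns, pod, port).Inc() }
func recordUnsubscribe(ns, pod, port string) { portUnsubscribes.WithLabelValues(ns, pod, port).Inc() }
```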

🌮 to @adleong for the diagnosis and the fix!
2023-08-08 11:28:51 -05:00
Matei David c2780adde1
Introduce an EndpointsWatcher cache structure (#11190)
Currently, the control plane does not support indexing and discovering resources across cluster boundaries. In a multicluster set-up, it is desirable to have access to endpoint data (and by extension, any configuration associated with that endpoint) to support pod-to-pod communication. Linkerd's destination service should be extended to support reading discovery information from multiple sources (i.e. API Servers).

This change introduces an `EndpointsWatcher` cache. On creation, the cache registers a pair of event handlers:
* One handler to read `multicluster` secrets that embed kubeconfig files. For each such secret, the cache creates a corresponding `EndpointsWatcher` to read remote discovery information.
* Another handler to evict entries from the cache when a `multicluster` secret is deleted.

Additionally, a set of tests have been added that assert the behaviour of the cache when secrets are created and/or deleted. The cache will be used by the destination service to do look-ups for services that belong to another cluster, and that are running in a "remote discovery" mode.
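
Roughly, the cache takes the following shape (a minimal sketch; `ClusterStore` and its methods are illustrative names, and the kubeconfig/watcher construction is elided):

```go
package watcher

import "sync"

// EndpointsWatcher is a placeholder for the real watcher type.
type EndpointsWatcher struct{}

// ClusterStore keeps one EndpointsWatcher per remote cluster, keyed by the
// name of the multicluster secret that holds the cluster's kubeconfig.
type ClusterStore struct {
	mu       sync.RWMutex
	watchers map[string]*EndpointsWatcher
}

func NewClusterStore() *ClusterStore {
	return &ClusterStore{watchers: map[string]*EndpointsWatcher{}}
}

// AddCluster is called from the secret "add" handler: build a client from the
// embedded kubeconfig and start a watcher for that cluster.
func (c *ClusterStore) AddCluster(name string, w *EndpointsWatcher) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.watchers[name] = w
}

// RemoveCluster is called from the secret "delete" handler to evict the entry.
func (c *ClusterStore) RemoveCluster(name string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.watchers, name)
}

// Get is used by the destination service to look up a remote cluster's watcher.
func (c *ClusterStore) Get(name string) (*EndpointsWatcher, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	w, ok := c.watchers[name]
	return w, ok
}
```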

---------

Signed-off-by: Matei David <matei@buoyant.io>
2023-08-04 13:59:21 -07:00
Alex Leong f7f1a95081
Add mutex lock to updateServer (#11169)
Fixes #11163

The `servicePublisher.updateServer` function will iterate through all registered listeners and update them.  However, a nil listener may temporarily be in the list of listeners if an unsubscribe is in progress.  This results in a nil pointer dereference.

All functions which result in updating the listeners must therefore be protected by the mutex so that we don't try to act on the list of listeners while it is being modified.
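
A minimal sketch of the locking pattern (the real `servicePublisher` carries more state; the `listener` interface here is a stand-in):

```go
package watcher

import "sync"

type listener interface {
	Update(server string)
}

type servicePublisher struct {
	mu        sync.Mutex
	listeners []listener
}

// updateServer holds the lock while iterating, so it never observes a
// listeners slice that an in-flight unsubscribe is still mutating.
func (sp *servicePublisher) updateServer(server string) {
	sp.mu.Lock()
	defer sp.mu.Unlock()
	for _, l := range sp.listeners {
		l.Update(server)
	}
}

// unsubscribe also takes the lock before modifying the slice.
func (sp *servicePublisher) unsubscribe(target listener) {
	sp.mu.Lock()
	defer sp.mu.Unlock()
	for i, l := range sp.listeners {
		if l == target {
			sp.listeners = append(sp.listeners[:i], sp.listeners[i+1:]...)
			return
		}
	}
}
```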

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-07-27 11:17:53 -07:00
Mark Robinson 21209955c2
Fix bug where topology routing would not disable while service was under load. (#10925)
Add support for enabling and disabling topology aware routing when hints are added/removed.

The testing setup is quite involved because there are so many moving parts:

1) Setup a service which is layered over several availability zones.
1a) The best way to do this is one service object, with 3 replicasets explicitly forced to use a specific AZ each.
2) Add the `service.kubernetes.io/topology-aware-hints: Auto` annotation to the Service object
3) Use a load tester like k6 to send meaningful traffic to your service but only in one AZ
4) Scale up your replica sets until k8s adds Hints to your endpointslices
5) Observe that traffic shifts to only hit pods in one AZ

6) Turn down the replicaset count until such time that K8s removes the hints from your endpointslices
7) Observe traffic shifts back to all pods across all AZs.
2023-05-26 10:31:14 -07:00
Alejandro Pedraza 33adae2b7c
Fix `server_port_subscribers` metric in Destination (#10820)
Fixes #10764

`GetProfile` streams create a `server_port_subscribers` gauge that tracks the number of listeners interested in a given Server. Because of an oversight, the gauge was only being registered once a second listener was added. For just one listener the gauge was absent. But whenever the `GetProfile` stream ended, the gauge was deleted, which resulted in this error if it wasn't registered to begin with:

```
level=warning msg="unable to delete server_port_subscribers metric with labels map[name:voting namespace:emojivoto port:4191]" addr=":8086" component=server
```

One can check that the gauge wasn't being created by installing viz and emojivoto, and checking the following returns empty:

```bash
$ linkerd diagnostics controller-metrics | grep server_port_subscribers
```

After this fix, one can see the metric getting populated:

```bash
$ linkerd diagnostics controller-metrics | grep server_port_subscribers

# HELP server_port_subscribers Number of subscribers to Server changes associated with a pod's port.
# TYPE server_port_subscribers gauge
server_port_subscribers{name="emoji",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="4191"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9990"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9995"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9996"} 1
server_port_subscribers{name="linkerd",namespace="linkerd",port="9997"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="metrics",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9995"} 1
server_port_subscribers{name="tap",namespace="linkerd-viz",port="9998"} 1
server_port_subscribers{name="vote",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="voting",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="emojivoto",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="4191"} 1
server_port_subscribers{name="web",namespace="linkerd-viz",port="9994"} 1
```

And when scaling down the voting deployment, one can see how the metric with `name="voting"` is removed.
2023-04-26 16:48:50 -05:00
Oliver Gould 08f97cc26f
destination: Avoid sending spurious profile updates (#10517)
The GetProfile API endpoint does not behave as expected: when a profile
watch is established, the API server starts two separate profile
watches--a primary watch with the client's namespace and a fallback watch
ignoring the client's namespace. These watches race to send data back to
the client. If the backup watch updates first, it may be sent to clients
before being corrected by a subsequent update. If the primary watch
updates with an empty value, the default profile may be served before
being corrected by an update to the backup watch.

From the proxy's perspective, we'd much prefer that the API provide a
single authoritative response when possible. This avoids distributing
needless corrective work across the system on every watch
initiation.

To fix this, we modify the fallbackProfileListener to behave
predictably: it only emits updates once both its primary and fallback
listeners have been updated. This avoids emitting updates based on a
partial understanding of the cluster state.
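
A condensed sketch of that behavior, using stand-in types rather than the real profile listener interfaces:

```go
package destination

import "sync"

type profileUpdate struct{ name string } // stand-in for a ServiceProfile

// fallbackListener prefers the primary value and only emits once both the
// primary and fallback sources have reported, avoiding updates based on a
// partial view of the cluster.
type fallbackListener struct {
	mu                sync.Mutex
	primary, fallback *profileUpdate
	primarySet, fbSet bool
	send              func(*profileUpdate)
}

func (f *fallbackListener) UpdatePrimary(p *profileUpdate) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.primary, f.primarySet = p, true
	f.publish()
}

func (f *fallbackListener) UpdateFallback(p *profileUpdate) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.fallback, f.fbSet = p, true
	f.publish()
}

func (f *fallbackListener) publish() {
	if !f.primarySet || !f.fbSet {
		return // wait for both watches before emitting anything
	}
	if f.primary != nil {
		f.send(f.primary)
		return
	}
	f.send(f.fallback)
}
```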

Furthermore, the opaquePortsAdaptor is updated to avoid synthesizing a
default serviceprofile (surprising behavior) and, instead, this
defaulting logic is moved into a dedicated defaultProfileListener
helper. A dedupProfileListener is added to squelch obviously redundant
updates.

Finally, this newfound predictability allows us to simplify the API's
tests. Many of the API tests are not clear in what they test and
sometimes make assertions about the "incorrect" profile updates.
2023-03-13 13:36:18 -07:00
Oliver Gould c657aeabe8
destination: Split GetProfile into smaller functions (#10514)
Before changing any GetProfile behavior, this change splits the API
handler into some smaller scopes. This helps to clarify control flow and
reduce nested contexts. This change also adds relevant fields to log
contexts to improve diagnostics.
2023-03-13 11:36:39 -07:00
Yannick Utard 95a3d19b3f
Fix deleteEndpointSlice when there are multiple EndpointSlices (#10370)
When processing a `delete` event for an EndpointSlice, regardless of the outcome (whether there are still addresses/endpoints alive or whether we have no endpoints left) we make a call to `noEndpoints` on the subscriber. This will send a `Destination.Get` update to the listeners to advertise a `NoEndpoints` event.

If there are still addresses left, NoEndpoints { exists: true } will be sent (to signal the service still exists). Until an update is registered (i.e. a new add) there will be no available endpoints -- this is incorrect, since other EndpointSlices may exist. This change fixes the problem by handling `noEndpoints` in a more specialized way for EndpointSlices.

Signed-off-by: Yannick Utard <yannickutard@gmail.com>
2023-02-22 13:35:02 +00:00
Alejandro Pedraza 4a84f2cb32
Implement the k8s metadata API in the Destination controller (#10326)
Fixes #9986

After reviewing the k8s API calls in Destination, it was concluded we
could only swap out the calls to the Node and RS resources to use the
metadata API, as all the other resources (Endpoints, EndpointSlices,
Services, Pod, ServiceProfiles, Server) required fields other than those
found in their metadata section.

This also required completing the `NewFakeAPI` implementation by adding
the missing annotations and labels entries.
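
For reference, fetching only object metadata with client-go's metadata client looks roughly like this (a sketch; the controller itself wires this through its informer machinery, and the kubeconfig path and resource names below are placeholders):

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/metadata"
	"k8s.io/client-go/tools/clientcmd"
)

// replicaSetLabels fetches only the metadata (labels, annotations, ownerRefs)
// of a ReplicaSet, instead of the full object, to cut memory usage.
func replicaSetLabels(ctx context.Context, kubeconfig, ns, name string) (map[string]string, error) {
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		return nil, err
	}
	client, err := metadata.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	gvr := schema.GroupVersionResource{Group: "apps", Version: "v1", Resource: "replicasets"}
	meta, err := client.Resource(gvr).Namespace(ns).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return nil, err
	}
	return meta.Labels, nil
}

func main() {
	labels, err := replicaSetLabels(context.Background(), "/path/to/kubeconfig", "emojivoto", "web-xyz")
	fmt.Println(labels, err)
}
```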

## Testing Memory Consumption

The gains here aren't as big as in #9650. In order to test this we need
to push hard and create 4000 RS:

``` bash
for i in {0..4000}; do kubectl create deployment test-pod-$i --image=nginx; done
```

In edge-23.2.1 the destination pod's memory consumption goes from 40Mi
to 160Mi after all the RS were created. With this change, it went from
37Mi to 140Mi.
2023-02-13 17:30:07 -05:00
Alex Leong 6ebcf19cfc
Don't use pods in map keys in server watcher (#10245)
Fixes #10205

The server watcher uses `*corev1.Pod` in the key of its subscriber map, which is potentially brittle since it's not obvious what exactly should count as equality for full pod resources.

More robust is to use a unique per-pod identifier in the key: the pod's name and namespace.  We therefore move the pod resource into the value type of the map since we still need it when we receive a Server update so that we can iterate through all subscriptions and find pods which match the Server's pod selector.
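
Sketched, the new key/value shape looks something like this (illustrative type names):

```go
package watcher

import corev1 "k8s.io/api/core/v1"

// podID is a stable, comparable identifier for a pod.
type podID struct {
	namespace string
	name      string
}

// ServerUpdateListener is a stand-in for the real listener interface.
type ServerUpdateListener interface{}

// podSubscription keeps the full pod in the value so that Server updates can
// still be matched against the pod's labels via the Server's podSelector.
type podSubscription struct {
	pod       *corev1.Pod
	listeners []ServerUpdateListener
}

// subscriptions is keyed by podID rather than by *corev1.Pod, so map lookups
// don't depend on pointer identity or deep equality of full pod resources.
type subscriptions map[podID]*podSubscription
```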

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-02-13 13:41:00 -08:00
Alejandro Pedraza 147c8dc07c
Add metrics to server and service watchers (#10213)
* Add metrics to server and service watchers

Closes #10202 and completes #2204

As a followup to #10201, I'm adding the following metric in `server_watcher.go`:

- `server_port_subscribers`: This tracks the number of subscribers to changes to Servers associated with a port in a pod. The metric's labels identify the namespace and name of the pod, and its targeted port.

Additionally, `opaque_ports.go` was missing metrics as well. I added `service_subscribers` which tracks the number of subscribers to a given Service, labeled by the Service's namespace and name.

`opaque_ports.go` was also leaking the subscriber's map key, so that got fixed as well.
2023-02-07 08:51:09 -05:00
Yu Cao e662e147ca
Support service internal traffic policy (#10186)
Closes #10130

https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/
1. Update endpoints watcher to include additional field `localTrafficPolicy`.
   Set to true when `.spec.internalTrafficPolicy` is set to `Local`
2. Update endpoints translator to filter by node when `localTrafficPolicy` is
   set to true. Topology Aware Hints are not used when
   `service.spec.internalTrafficPolicy` is set to `Local` (see the sketch below)
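
A sketch of the node filter (illustrative types; the real translator works on its own address type):

```go
package watcher

// address is a stand-in for the watcher's Address type.
type address struct {
	IP       string
	NodeName string
}

// filterLocalTrafficPolicy keeps only the endpoints that live on the given
// node when the Service has .spec.internalTrafficPolicy: Local; otherwise it
// returns the addresses unchanged.
func filterLocalTrafficPolicy(addrs []address, localTrafficPolicy bool, nodeName string) []address {
	if !localTrafficPolicy {
		return addrs
	}
	out := make([]address, 0, len(addrs))
	for _, a := range addrs {
		if a.NodeName == nodeName {
			out = append(out, a)
		}
	}
	return out
}
```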

Signed-off-by: Yu Cao <yc185050@ncr.com>
2023-02-06 13:53:07 -07:00
Alejandro Pedraza 8b29408db3
Fix memory leak in Server listeners (#10201)
Fixes #8270

When a listener unsubscribes from port updates in Servers, we were
removing the listener from the `ServerWatcher.subscriptions` map, leaving
the map's key (`podPort`, which holds the pod object and port) with an
empty value. In clusters where there's a lot of pod churn, those keys
with empty values were getting accumulated, so this change cleans that
up.
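
The cleanup amounts to deleting the key once its listener slice is empty, roughly (illustrative types):

```go
package watcher

import "sync"

type podPort struct {
	pod  string // namespace/name
	port uint32
}

type serverWatcher struct {
	mu            sync.Mutex
	subscriptions map[podPort][]chan struct{}
}

// unsubscribe removes the listener and, crucially, deletes the map entry once
// no listeners remain, so keys don't accumulate as pods churn.
func (s *serverWatcher) unsubscribe(key podPort, listener chan struct{}) {
	s.mu.Lock()
	defer s.mu.Unlock()
	listeners := s.subscriptions[key]
	for i, l := range listeners {
		if l == listener {
			listeners = append(listeners[:i], listeners[i+1:]...)
			break
		}
	}
	if len(listeners) == 0 {
		delete(s.subscriptions, key) // avoid leaking empty entries
		return
	}
	s.subscriptions[key] = listeners
}
```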

The repro (basically constantly rolling emojivoto) is described in
#9947.

A followup will be up shortly adding metrics to track these subscriptions,
along with similar missing metrics from other parts of Destination.
2023-01-26 12:20:48 -05:00
dependabot[bot] 62d6d7cd52
build(deps): bump sigs.k8s.io/gateway-api from 0.5.1 to 0.6.0 (#10038)
* build(deps): bump sigs.k8s.io/gateway-api from 0.5.1 to 0.6.0

Bumps [sigs.k8s.io/gateway-api](https://github.com/kubernetes-sigs/gateway-api) from 0.5.1 to 0.6.0.
- [Release notes](https://github.com/kubernetes-sigs/gateway-api/releases)
- [Changelog](https://github.com/kubernetes-sigs/gateway-api/blob/main/CHANGELOG.md)
- [Commits](https://github.com/kubernetes-sigs/gateway-api/compare/v0.5.1...v0.6.0)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/gateway-api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Account for possible errors returned from `AddEventHandler`

In v0.26.0 client-go's `AddEventHandler` method for informers started
returning a registration handle (that we ignore) and an error that we
now surface up.

* client-go v0.26.0 removed the openstack plugin

* Temporary changes to trigger tests in k8s 1.21

- Adds an innocuous change to integration.yml so that all tests get
  triggered
- Hard-code k8s version in `k3d cluster create` invocation to v1.21

* Revert "Temporary changes to trigger tests in k8s 1.21"

This reverts commit 3e1fdd0e5e.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2023-01-16 09:38:09 -05:00
Alex Leong 768e04dd7e
Update endpoints watcher to not fetch pods for removed endpoints (#10013)
Fixes #10003

When endpoints are removed from an EndpointSlice resource, the destination controller builds a list of addresses to remove.  However, if any of the removed endpoints have a Pod as their targetRef, we will attempt to fetch that pod to build the address to remove.  If that pod has already been removed from the informer cache, this will fail and the endpoint will be skipped in the list of endpoints to be removed.  This results in stale endpoints being stuck in the address set and never being removed.

We update the endpoint watcher to construct only a list of endpoint IDs for endpoints to remove, rather than fetching the entire pod object.  Since we no longer attempt to fetch the pod, this operation is now infallible and endpoints will no longer be skipped during removal.
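
Sketched, building the removal list from the EndpointSlice alone (the function name is made up; the real code builds watcher IDs rather than plain strings):

```go
package watcher

import discovery "k8s.io/api/discovery/v1"

// podIDsToRemove builds the identifiers of endpoints to remove purely from the
// EndpointSlice itself, without fetching Pod objects that may already be gone
// from the informer cache.
func podIDsToRemove(slice *discovery.EndpointSlice) []string {
	var ids []string
	for _, ep := range slice.Endpoints {
		ref := ep.TargetRef
		if ref == nil || ref.Kind != "Pod" {
			continue
		}
		ids = append(ids, ref.Namespace+"/"+ref.Name)
	}
	return ids
}
```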

We also add a `TestEndpointSliceScaleDown` test to exercise this.

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-01-03 10:04:02 -08:00
Oliver Gould b2f22dee78
go: Copy port range utilities from the proxy-init repo (#9143)
The proxy-init repo is changing its structure and, as such, we want to
minimize cross-repo dependencies from linkerd2 to linkerd2-proxy-init.
(We expect the cni-plugin code to move in a followup change).

This change duplicates the port range parsing utility (about 50 lines,
plus tests). This avoids stray dependencies on linkerd2-proxy-init.
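
The copied utility is roughly of this shape (a sketch, not the verbatim code):

```go
package ports

import (
	"fmt"
	"strconv"
	"strings"
)

// PortRange is an inclusive range of ports.
type PortRange struct {
	LowerBound int
	UpperBound int
}

// ParsePortRange parses either a single port ("25") or a dash-separated range
// ("8000-8999") and validates the bounds.
func ParsePortRange(s string) (PortRange, error) {
	bounds := strings.SplitN(strings.TrimSpace(s), "-", 2)
	lower, err := strconv.Atoi(strings.TrimSpace(bounds[0]))
	if err != nil || lower < 0 || lower > 65535 {
		return PortRange{}, fmt.Errorf("%q is not a valid lower-bound", bounds[0])
	}
	upper := lower
	if len(bounds) == 2 {
		upper, err = strconv.Atoi(strings.TrimSpace(bounds[1]))
		if err != nil || upper < lower || upper > 65535 {
			return PortRange{}, fmt.Errorf("%q is not a valid upper-bound", bounds[1])
		}
	}
	return PortRange{LowerBound: lower, UpperBound: upper}, nil
}
```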

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-08-12 10:34:02 -07:00
Jacob Henner 7d47639608
Remove kube-system exclusions from watchers (#8720)
Watch events for objects in the kube-system namespace were previously ignored.
In certain situations, this would cause the destination service to return
invalid (outdated) endpoints for services in kube-system - including unmeshed
services.

It [was suggested][1] that kube-system events were ignored to avoid handling
frequent Endpoint updates - specifically from [controllers using Endpoints for
leader elections][2]. As of Kubernetes 1.20, these controllers [default to using
Leases instead of Endpoints for their leader elections][3], obviating the need
to exclude (or filter) updates from kube-system. The exclusions have been
removed accordingly.

[1]: https://github.com/linkerd/linkerd2/pull/4133#issuecomment-594983588
[2]: https://github.com/kubernetes/kubernetes/issues/86286
[3]: https://github.com/kubernetes/kubernetes/pull/94603

Signed-off-by: Jacob Henner <code@ventricle.us>
2022-07-11 13:52:27 -06:00
Alex Leong b7a0b8adb4
Bump minimum kubernetes version to 1.21 (#8647)
Fixes #8592

Increase the minimum supported kubernetes version from 1.20 to 1.21.  This allows us to drop support for batch/v1beta1/CronJob and discovery/v1beta1/EndpointSlices, instead using only v1 of those resources.  This fixes the deprecation warnings about these resources printed by the CLI.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-06-14 15:15:28 -07:00
Alex Leong c5963fbbb1
Set targetCluster label even when serviceFqn is not set (#8542)
Fixes #8134

The multicluster probe-gateway service (which is used by the service-mirror controller to send health probes to the remote gateway) has `mirror.linkerd.io/cluster-name` label but it does not have a `mirror.linkerd.io/remote-svc-fq-name` annotation.  This makes sense because the probe-gateway service does not correspond to any target service on the remote cluster and instead targets the gateway itself.

We update the logic in the destination controller so that we can still add the `dst_target_cluster` metric label when only the `mirror.linkerd.io/cluster-name` label is present but not the `mirror.linkerd.io/remote-svc-fq-name` annotation.  This allows us to have the `dst_target_cluster` metric label for service-mirror controller probe traffic.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-05-31 16:09:51 -07:00
Oliver Gould fa8ddb4801
Use go-test/deep for comparisons in tests (#8427)
We frequently compare data structures--sometimes very large data
structures--that are difficult to compare visually. This change replaces
uses of `reflect.DeepEqual` with `deep.Equal`. `go-test`'s `deep.Equal`
returns a diff of values that are not equal.
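
Typical usage in a test (a sketch with a made-up struct):

```go
package watcher_test

import (
	"testing"

	"github.com/go-test/deep"
)

type addressSet struct {
	Addresses []string
	Labels    map[string]string
}

func TestAddressSetDiff(t *testing.T) {
	got := addressSet{Addresses: []string{"10.0.0.1"}, Labels: map[string]string{"zone": "a"}}
	want := addressSet{Addresses: []string{"10.0.0.1"}, Labels: map[string]string{"zone": "b"}}

	// deep.Equal returns a list of human-readable differences (nil when equal),
	// which is far easier to act on than reflect.DeepEqual's bare bool.
	if diff := deep.Equal(got, want); diff != nil {
		t.Errorf("unexpected address set: %v", diff)
	}
}
```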

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-05-05 09:31:07 -07:00
Matei David 1e9f734bcd
Support whitespaces in opaqueports annotation (#8355)
Opaque ports may be configured through annotations, where the value may
be either a range (e.g. 0 - 255) or a comma delimited string
("121,122"). When configured as a comma delimited string, our parsing
logic will trim any leading and trailing commas and split the value;
ports will be converted from string to an int counterpart.

If whitespaces are used, such that the value looks similar to "121,
122", the parsing logic will fail -- when attempting to convert the
string into an integer -- with the following error "\" 122\" is not a
valid lower-bound". This can lead to confusion for users whose services
and endpoints function with undefined behaviour.

This change introduces logic to strip any leading or trailing
whitespaces from strings when tokenizing the annotation value; this way,
we are guaranteed not to experience parsing errors when converting
strings. To validate the behaviour, a new (unit) test has been added to
the opaque ports watcher with a multi-opaque-port annotation on a
service, where the value contains a space.
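
The tokenizing step now looks roughly like this (a simplified sketch that omits range handling):

```go
package util

import (
	"strconv"
	"strings"
)

// parseOpaquePorts tokenizes a comma-delimited annotation value, trimming any
// surrounding whitespace from each token before converting it to a port.
func parseOpaquePorts(annotation string) ([]int, error) {
	var ports []int
	for _, tok := range strings.Split(annotation, ",") {
		tok = strings.TrimSpace(tok)
		if tok == "" {
			continue
		}
		port, err := strconv.Atoi(tok)
		if err != nil {
			return nil, err
		}
		ports = append(ports, port)
	}
	return ports, nil
}
```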

Signed-off-by: Matei David <matei@buoyant.io>
2022-04-28 13:46:27 +03:00
Alejandro Pedraza 30e42f98f6
Fix race in destination unit test (#8065)
Fixes report https://github.com/linkerd/linkerd2/runs/5518386921
by guarding `BufferingProfileListener` with a mutex.
2022-03-14 11:26:05 -07:00
Kevin Leimkuhler 67bcd8f642
Add `gosec` and `errcheck` lints (#7954)
Closes #7826

This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed by individual changes, but this enables them by default so that all future changes are caught ahead of time.

A significant number of these lints have been excluded by the various `exclude-rules` rules added to `.golangci.yml`. These include operations on files that generally do not fail, such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors from cleanup calls made via the `defer` keyword.

Aside from those, there are several other rules added that all have comments explaining why it's okay to ignore the errors that they cover.

Finally, several smaller fixes in the code have been made where it seems necessary to catch errors or at least log them.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-03 10:09:51 -07:00
Oliver Gould 425a43def5
Enable gocritic linting (#7906)
[gocritic][gc] helps to enforce some consistency and check for potential
errors. This change applies linting changes and enables gocritic via
golangci-lint.

[gc]: https://github.com/go-critic/go-critic

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-17 22:45:25 +00:00
Oliver Gould f5876c2a98
go: Enable `errorlint` checking (#7885)
Since Go 1.13, errors may "wrap" other errors. [`errorlint`][el] checks
that error formatting and inspection is wrapping-aware.

This change enables `errorlint` in golangci-lint and updates all error
handling code to pass the lint. Some comparisons in tests have been left
unchanged (using `//nolint:errorlint` comments).
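
For reference, the pattern `errorlint` pushes toward (a generic example, not code from this repo):

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// Wrap with %w so callers can inspect the cause...
func loadConfig(path string) ([]byte, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("failed to load config %s: %w", path, err)
	}
	return b, nil
}

func main() {
	_, err := loadConfig("/does/not/exist")

	// ...and inspect with errors.Is/errors.As instead of == or type assertions,
	// which is what errorlint enforces.
	if errors.Is(err, fs.ErrNotExist) {
		fmt.Println("config file missing, using defaults")
	}
	var pathErr *fs.PathError
	if errors.As(err, &pathErr) {
		fmt.Println("failed path:", pathErr.Path)
	}
}
```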

[el]: https://github.com/polyfloyd/go-errorlint

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-16 18:32:19 -07:00
Oliver Gould 7776a50074
destination: Use net.JoinHostPort to format names (#7821)
We use `fmt.Sprintf` to format URIs in several places we could be using
`net.JoinHostPort`. `net.JoinHostPort` ensures that IPv6 addresses are
properly escaped in URIs.
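
A quick illustration of the difference (generic example):

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	host, port := "2001:db8::1", "8080"

	// Naive formatting produces an ambiguous authority for IPv6 hosts...
	naive := fmt.Sprintf("%s:%s", host, port)
	fmt.Println(naive) // 2001:db8::1:8080

	// ...while net.JoinHostPort brackets the address correctly.
	fmt.Println(net.JoinHostPort(host, port)) // [2001:db8::1]:8080
}
```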

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-02-07 15:05:14 -08:00
Kevin Leimkuhler 147d85dc70
Update `GetProfile` clients with policy server updates (#7388)
### What

`GetProfile` clients do not receive destination profiles that consider Server protocol fields the way that `Get` clients do. If a Server exists for a `GetProfile` destination that specifies the protocol for that destination is `opaque`, this information is not passed back to the client.

#7184 added this for `Get` by subscribing clients to Endpoint/EndpointSlice updates. When there is an update, or there is a Server update, the endpoints watcher passes this information back to the endpoint translator which handles sending the update back to the client.

For `GetProfile` the situation is different. As with `Get`, we only consider Servers when dealing with Pod IPs, but this only occurs in two situations for `GetProfile`.

1. The destination is a Pod IP and port
2. The destination is an Instance ID and port

In both of these cases, we need to check if a Server already selects the endpoint, and we need to subscribe for Server updates in case one is added or deleted which selects the endpoint.

### How

First we check if there is already a Server which selects the endpoint. This is so that when the first destination profile is returned, the client knows if the destination is `opaque` or not.

After sending that first update, we then subscribe the client for any future updates which will come from a Server being added or deleted.

This is handled by the new `ServerWatcher` which watches for Server updates on the cluster; when an update occurs it sends that to the `endpointProfileTranslator` which translates the protocol update into a DestinationProfile.

By introducing the `endpointProfileTranslator` which only handles protocol updates, we're able to decouple the endpoint logic from `profileTranslator`; its `endpoint` field has been removed now that it only handles updates for ServiceProfiles for Services.

### Testing

A unit test has been added and below are some manual testing instructions to see how it interacts with Server updates:

<details>
	<summary>app.yaml</summary>

	```yaml
	apiVersion: v1
	kind: Pod
	metadata:
	  name: pod
	  labels:
		app: pod
	spec:
	  containers:
	  - name: app
		image: nginx
		ports:
		  - name: http
			containerPort: 80
	---
	apiVersion: policy.linkerd.io/v1beta1
	kind: Server
	metadata:
	  name: srv
	  labels:
		policy: srv
	spec:
	  podSelector:
		matchLabels:
		  app: pod
	  port: 80
	  proxyProtocol: opaque
	```
</details>

```shell
$ go run ./controller/cmd/main.go destination
```

```shell
$ linkerd inject app.yaml |kubectl apply -f -
...
$ kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP           NODE                       NOMINATED NODE   READINESS GATES
pod    2/2     Running   0          53m   10.42.0.34   k3d-k3s-default-server-0   <none>           <none>
$ go run ./controller/script/destination-client/main.go -method getProfile -path 10.42.0.34:80
...
```

You can add/delete `srv` as well as edit its `proxyProtocol` field to observe the correct DestinationProfile updates.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2021-12-08 12:26:27 -07:00
Tarun Pothulapati 4170b49b33
smi: remove default functionality in linkerd (#7334)
Now that SMI functionality is fully being moved into the
[linkerd-smi](www.github.com/linkerd/linkerd-smi) extension, we can
stop supporting its functionality by default.

This means that the `destination` component will stop reacting
to `TrafficSplit` objects. When `linkerd-smi` is installed,
it converts `TrafficSplit` objects to `ServiceProfiles`
that the destination component can understand, and will react accordingly.

Also, whenever a `ServiceProfile` with traffic splitting is associated
with a service, the same information (i.e. splits and weights) is also
surfaced through the `UI` (in the new `services` tab) and the `viz cmd`.
So, we are not really losing any UI functionality here.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-12-03 12:07:30 +05:30
Krzysztof Dryś f92e77f7f0
Remove legacy upgrade and its references (#7309)
With [linkerd2#5008](https://github.com/linkerd/linkerd2/issues/5008) and associated PRs, we changed the way configuration is handled by storing a helm values struct inside of the configmap.

Now that we have had one stable release with the new configuration, we no longer use or need to maintain the legacy config. This commit removes all the associated logic, protobuf files, and references.

Changes Include:

- Removed [`proto/config/config.proto`](https://github.com/linkerd/linkerd2/blob/main/proto/config/config.proto)
- Changed [`bin/protoc-go.sh`](https://github.com/linkerd/linkerd2/blob/main/bin/protoc-go.sh) to not include `config.proto`
- Changed [`FetchLinkerdConfigMap()`](741fde679b/pkg/healthcheck/healthcheck.go (L1768)) in `healthcheck.go` to return only the configmap, with the pb type.
- Changed [`FetchCurrentConfiguration()`](741fde679b/pkg/healthcheck/healthcheck.go (L1647)) to only unmarshal and use the helm values struct from the configmap (as a follow-up to the todo above; note that there's already a todo here to refactor the function once the values struct is the default, which has already happened)
- Removed [`upgrade_legacy.go`](https://github.com/linkerd/linkerd2/blob/main/cli/cmd/upgrade_legacy.go)

Signed-off-by: Krzysztof Dryś <krzysztofdrys@gmail.com>
2021-11-29 20:08:58 +05:30
Kevin Leimkuhler 01cbe616f1
Honor Server `proxyProtocol` in destination service `Get` with policy CRD APIs (#7184)
This change ensures that if a Server exists with `proxyProtocol: opaque` that selects an endpoint backed by a pod, that destination requests for that pod reflect the fact that it handles opaque traffic.

Currently, the only way that opaque traffic is honored in the destination service is if the pod has the `config.linkerd.io/opaque-ports` annotation. With the introduction of Servers though, users can set `server.Spec.ProxyProtocol: opaque` to indicate that if a Server selects a pod, then traffic to that pod's `server.Spec.Port` should be opaque. Currently, the destination service does not take this into account.

There is an existing change up that _also_ adds this functionality; it takes a different approach by creating a policy server client for each endpoint that a destination has. For `Get` requests on a service, the number of clients scales with the number of endpoints that back that service.

This change fixes that issue by instead creating a Server watch in the endpoint watcher and sending updates through to the endpoint translator.

The two primary scenarios to consider are

### A `Get` request for some service is streaming when a Server is created/updated/deleted
When a Server is created or updated, the endpoint watcher iterates through its endpoint watches (`servicePublisher` -> `portPublisher`) and if it selects any of those endpoints, the port publisher sends an update if the Server has marked that port as opaque.

When a Server is deleted, the endpoint watcher once again iterates through its endpoint watches and deletes the address set's `OpaquePodPorts` field—ensuring that updates have been cleared of Server overrides.

### A `Get` request for some service happens after a Server is created
When a `Get` request occurs (or new endpoints are added—they both take the same path), we must check if any of those endpoints are selected by some existing Server. If so, we have to take that into account when creating the address set.

This part of the change gives me a little concern as we first must get all the Servers on the cluster and then create a set of _all_ the pod-backed endpoints that they select in order to determine if any of these _new_ endpoints are selected.

## Testing
Right now this can be tested by starting up the destination service locally and running `Get` requests on a service that has endpoints selected by a Server

**app.yaml**
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
  labels:
    app: pod
spec:
  containers:
  - name: app
    image: nginx
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: svc
spec:
  selector:
    app: pod
  ports:
  - name: http
    port: 80
---
apiVersion: policy.linkerd.io/v1alpha1
kind: Server
metadata:
  name: srv
  labels:
    policy: srv
spec:
  podSelector:
    matchLabels:
      app: pod
  port: 80
  proxyProtocol: HTTP/1
```

```bash
$ go run controller/script/destination-client/main.go -path svc.default.svc.cluster.local:80
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-11-23 20:35:53 -07:00
Alex Leong ea5461f674
Fix identity overrides for endpoint slices (#7243)
When the `mirror.linkerd.io/remote-gateway-identity` and `mirror.linkerd.io/remote-svc-fq-name` annotations are set on an EndpointSlice object, the destination controller does not return the correct identity hints for that endpoint.

We fix an incorrect assignment to fix this.  We also fix some logic that can result in a nil pointer dereference instead of comparing empty strings.  We add a test case to exercise these.

Signed-off-by: Alex Leong <alex@buoyant.io>
2021-11-09 16:38:46 -08:00
Kevin Leimkuhler ebb1ee8c4c
Deprecate `topologyKeys` and add support for endpoint slices `Hints`. (#6698)
## background

In order to upgrade `client-go` and other related libraries to `v0.22.0`, we had to address the deprecation of the service's `TopologyKeys` field. This field and its related feature have been deprecated and superseded by [Topology Aware Hints](https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/2433-topology-aware-hints/README.md).

The goal of topology aware hints is to provide a simpler way for users to prefer endpoints by basing decisions solely off the node's `topology.kubernetes.io/zone` label. If a node is in `zone-a`, then it should prefer endpoints that _should_ be consumed by clients in `zone-a`.

kube-proxy (and now the destination controller) know that an endpoint _should_ be consumed by clients in certain zones if its `Hints.ForZones` field is set with a zone value that matches that of the client.

For example, the endpoint slice controller may add the following hint to an endpoint:
```
- addresses: ["1.1.1.1"]
  zone: "zone-a"
  hints:
    zone: "zone-b"
```

The above endpoint is an endpoint that is located in `zone-a` but should be consumed by clients in `zone-b`.

## changes

Now that topological preference is no longer a concept, we can remove it from the `servicePublisher` and `portPublisher` structs. The fields were only there so that it could be populated down to individual addresses.

The `Hints` field is only present on endpoints that belong to an `EndpointSlice`, so use of this field is limited to the `endpointSliceToAddresses` function.

When endpoint slices are translated to an `AddressSet` now, for each address (endpoint) we make sure to copy the `Hints.ForZones` field if it is present. This field is only present if it's set by the endpoint slice controller and it has [several safeguards](https://kubernetes.io/docs/concepts/services-networking/topology-aware-hints/#safeguards).

After `endpointSliceToAddresses` has translated an endpoint slice into an `AddressSet` and updated the endpoint translator's `availableEndpoints`, filtering takes place and is the crux of this change.

For each potential address that we have to consider in `availableEndpoints`, we make sure to only return a set of addresses whose consumption zone (zones in the `forZones` field) matches the node's zone. That way, we only communicate with endpoints that have been labeled by the endpoint slice controller for the current node we're on.
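
The filtering step is roughly the following (a sketch over `discovery/v1` endpoints; the real code operates on the watcher's address type):

```go
package destination

import discovery "k8s.io/api/discovery/v1"

// filterForZone keeps only the endpoints whose Hints.ForZones contain the
// zone of the node this translator is running for. Endpoints without hints
// are dropped here for brevity; the real logic also has to handle slices
// where the controller set no hints at all.
func filterForZone(endpoints []discovery.Endpoint, nodeZone string) []discovery.Endpoint {
	var filtered []discovery.Endpoint
	for _, ep := range endpoints {
		if ep.Hints == nil {
			continue
		}
		for _, z := range ep.Hints.ForZones {
			if z.Name == nodeZone {
				filtered = append(filtered, ep)
				break
			}
		}
	}
	return filtered
}
```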

This allows us to remove the ordering/hierarchy of topological region and considering the `*` value.

## testing

I've added a unit test which creates an endpoint translator tied to a node in `west-1a` and asserts that it only handles updates for addresses that should be consumed by clients in `west-1a`.

Closes #6637

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-11-08 12:21:31 -07:00
Bart Peeters 6bf507ad32
Change log level for watchers in destination svc
Make destination info logs clearer by changing the log level of the watchers' log messages 'Establishing watch', 'Starting watch' and 'Stopping watch' from info to debug (#6917)

Signed-off-by: Bart Peeters <birtpeeters@hotmail.com>
2021-09-20 09:44:32 +01:00
LiuDui 61d4fa80fc
remove unused constants (#6630)
Signed-off-by: liudui <1693291525@qq.com>
2021-08-09 16:23:48 +01:00
Josh Soref 0be792fadc
Spelling (#6215)
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).

The misspellings have been reported at 0d56327e6f (commitcomment-51603624)

The action reports that the changes in this PR would make it happy: 03a9c310aa

Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2021-06-07 15:16:59 -06:00
Tarun Pothulapati fac28ff8a7
destination: Remove support for IP Queries in `Get` API (#6018)
* destination: Remove support for IP Queries in `Get` API

Fixes #5246

This PR updates the destination to report an error when `Get`
is called for IP queries. As the issue mentions, the proxies
are not using this API anymore, and this helps to simplify and
remove unnecessary logic.

This removes the relevant `IPWatcher` logic, along with
unit tests

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-04-21 12:40:40 +05:30
Bruce Chen Wenliang b84b2077d3
Ignore pods in "Terminating" when watching IP addresses. (#5940)
Fixes #5939

Some CNIs reassign the IP of a terminating pod to a new pod, which
leads to duplicate IPs in the cluster.

It eventually triggers #5939.

This commit makes the IPWatcher, when given an IP, filter out terminating pods
(i.e. pods that have been given a deletionTimestamp).
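
The filter itself is small; roughly (a sketch, with the Succeeded-phase check from #5412 below shown alongside for context):

```go
package watcher

import corev1 "k8s.io/api/core/v1"

// isRoutablePod reports whether a pod should be considered when resolving an
// IP. Terminating pods (deletionTimestamp set) are skipped, since some CNIs
// may already have handed their IP to a newer pod; pods in the Succeeded
// phase are skipped for the same reason.
func isRoutablePod(pod *corev1.Pod) bool {
	return pod.DeletionTimestamp == nil && pod.Status.Phase != corev1.PodSucceeded
}
```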

The issue is hard to reproduce because we are not able to assign a
particular IP to a pod manually.

Signed-off-by: Bruce <wenliang.chen@personio.de>

Co-authored-by: Bruce <wenliang.chen@personio.de>
2021-03-24 18:21:42 +05:30
Kevin Leimkuhler 3f72c998b3
Handle pod lookups for pods that map to a host IP and host port (#5904)
This fixes an issue where pod lookups by host IP and host port fail even though
the cluster has a matching pod.

Usually these manifested as `FailedPrecondition` errors, but the messages were
too long and resulted in http/2 errors. This change depends on #5893 which fixes
that separate issue.

This changes how often those `FailedPrecondition` errors actually occur. The
destination service now considers pod host IPs and should reduce the frequency
of those errors.

Closes #5881 

---

Lookups like this happen when a pod is created with a host IP and host port set
in its spec. It still has a pod IP when running, but requests to
`hostIP:hostPort` will also be redirected to the pod. Combinations of host IP
and host Port are unique to the cluster and enforced by Kubernetes.

Currently, the destination service fails to find pods in this scenario because
we only keep an index of pods and their pod IPs, not pods and their host IPs.
To fix this, we now also keep an index of pods and their host IPs, if and only if
they have the host IP set.

Now when doing a pod lookup, we consider both the IP and the port. We perform
the following steps:

1. Do a lookup by IP in the pod podIP index
  - If only one pod is found then return it
2. 0 or more than 1 pods have the same pod IP
3. Do a lookup by IP in the pod hostIP index
  - If any number of pods were found, we know that IP maps to a node IP.
    Therefore, we search for a pod with a matching host Port. If one exists then
    return it; if not then there is no pod that matches `hostIP:port`
4. The IP does not map to a host IP
5. If multiple pods were found in `1`, then we know there are pods with
   conflicting podIPs and an error is returned
6. If no pods were found in `1` then there is no pod that matches `IP:port`
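
Sketched as a single lookup helper (illustrative; the real code uses informer indexers and returns gRPC errors for the conflict cases):

```go
package watcher

import corev1 "k8s.io/api/core/v1"

// lookupPod resolves an ip:port to a single pod, using a podIP index first and
// falling back to a hostIP index, mirroring the steps described above.
// The two index maps stand in for the informer-backed indexers.
func lookupPod(ip string, port uint32, byPodIP, byHostIP map[string][]*corev1.Pod) (*corev1.Pod, bool) {
	podIPPods := byPodIP[ip]
	if len(podIPPods) == 1 {
		return podIPPods[0], true // unique pod IP match
	}
	if hostIPPods := byHostIP[ip]; len(hostIPPods) > 0 {
		// The IP is a node IP: match on the declared host port.
		for _, pod := range hostIPPods {
			for _, c := range pod.Spec.Containers {
				for _, p := range c.Ports {
					if uint32(p.HostPort) == port {
						return pod, true
					}
				}
			}
		}
		return nil, false // no pod exposes hostIP:port
	}
	if len(podIPPods) > 1 {
		return nil, false // conflicting pod IPs; the real code returns an error
	}
	return nil, false
}
```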

---

Aside from the additional IP watcher test being added, this can be tested with
the following steps:

1. Create a kind cluster. kind is required because its pods in `kube-system`
   have the same pod IPs; this is not the case with k3d: `bin/kind create cluster`
2. Install Linkerd with `4445` marked as opaque: `linkerd install --set
   proxy.opaquePorts="4445" |kubectl apply -f -`
3. Get the node IP: `kubectl get -o wide nodes`
4. Pull my fork of `tcp-echo`:

```
$ git clone https://github.com/kleimkuhler/tcp-echo
...
$ git checkout --track kleimkuhler/host-pod-repro
```

5. `helm package .`
6. Install `tcp-echo` with the server not injected and correct host IP: `helm
   install tcp-echo tcp-echo-0.1.0.tgz --set server.linkerdInject="false" --set
   hostIP="..."`
7. Looking at the client's proxy logs, you should not observe any errors or
   protocol detection timeouts.
8. Looking at the server logs, you should see all the requests coming through
   correctly.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-03-18 13:29:43 -04:00
Riccardo Freixo 66b89f55e7
Fix named port resolution mid roll out (#5912) (#5911)
# Problem

While rolling out, often not all pods will be ready on the same set of
ports, leading the Kubernetes Endpoints API to return multiple subsets,
each covering a different set of ports, with the end result that the
same address gets repeated across subsets.

The old code for endpointsToAddresses would loop through all subsets, and the
later occurrences of an address would overwrite previous ones, with the
last one prevailing.

If the last subset happened to be for an irrelevant port, and the port to
be resolved is named, resolveTargetPort would resolve to port 0, which would
return port 0 to clients, ultimately leading linkerd-proxy to forward
connections to port 0.

This only happens if the pods selected by a service expose > 1 port, the
service maps to > 1 of these ports, and at least one of these ports is named.

# Solution

Never write an address to set of addresses if resolved port is 0, which
indicates named port resolution failed.
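
The guard is essentially the following (a sketch of the idea, not the actual function):

```go
package watcher

import corev1 "k8s.io/api/core/v1"

// endpointsToAddrs collects ip:port pairs across all subsets, skipping any
// subset where the named target port failed to resolve (port 0), so a later
// irrelevant subset can no longer overwrite a good address with port 0.
func endpointsToAddrs(subsets []corev1.EndpointSubset, resolvePort func(corev1.EndpointSubset) uint32) map[string]uint32 {
	addrs := map[string]uint32{}
	for _, subset := range subsets {
		port := resolvePort(subset)
		if port == 0 {
			continue // named port resolution failed for this subset
		}
		for _, a := range subset.Addresses {
			addrs[a.IP] = port
		}
	}
	return addrs
}
```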

# Validation

Added a test case.

Signed-off-by: Riccardo Freixo <riccardofreixo@gmail.com>
2021-03-17 17:40:11 -04:00
Kevin Leimkuhler 1544d90150
dest: Reduce possible response size in destination service errors (#5893)
This reduces the possible HTTP response size from the destination service when
it encounters an error during a profile lookup.

If multiple objects on a cluster share the same IP (such as pods in
`kube-system`), the destination service will return an error with the two
conflicting pod yamls.

In certain cases, these pod yamls can be too large for the HTTP response and the
destination pod's proxy will indicate that with the following error:

```
hyper::proto::h2::server: send response error: user error: header too big
```

From the app pod's proxy, this results in the following error:

```
poll_profile: linkerd_service_profiles::client: Could not fetch profile error=status: Unknown, message: "http2 error: protocol error: unexpected internal error encountered"
```

We now only return the conflicting pods (or services) names. This reduces the
size of the returned error and fixes these warnings from occurring.

Example response error:

```
poll_profile: linkerd_service_profiles::client: Could not fetch profile error=status: FailedPrecondition, message: "Pod IP address conflict: kube-system/kindnet-wsflq, kube-system/kube-scheduler-kind-control-plane", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc", "date": "Fri, 12 Mar 2021 19:54:09 GMT"} }
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-03-16 13:28:05 -04:00
Tarun Pothulapati 5c1a375a51
destination: pass opaque-ports through cmd flag (#5829)
* destination: pass opaque-ports through cmd flag

Fixes #5817

Currently, default opaque ports are stored in two places, i.e.
`Values.yaml` and also at `opaqueports/defaults.go`. As these
ports are used only in destination, we can instead pass these
values as a cmd flag for the destination component from `Values.yaml`
and remove `defaultPorts` in `defaults.go`.

This means that if users override the `opaquePorts` field in `Values.yaml`,
that change is propagated both for injection and also
for discovery, as expected.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-03-01 16:00:20 +05:30
Kevin Leimkuhler 51a965e228
Return default opaque ports in the destination service (#5814)
This changes the destination service to always use a default set of opaque ports
for pods and services. This is so that after Linkerd is installed onto a
cluster, users can benefit from common opaque ports without having to annotate
the workloads that serve the applications.

After #5810 merges, the proxy containers will have the default opaque ports
`25,443,587,3306,5432,11211`. This value on the proxy container does not affect
traffic though; it only configures the proxy.

In order for clients and servers to detect opaque protocols and determine opaque
transports, the pods and services need to have these annotations.

The ports `25,443,587,3306,5432,11211` are now handled opaquely when a pod or
service does not have the opaque ports annotation. If the annotation is present
with a different value, this is used instead of the default. If the annotation
is present but is an empty string, there are no opaque ports for the workload.
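
The defaulting rule can be summarized as (a sketch; the annotation key is `config.linkerd.io/opaque-ports`, as referenced elsewhere in this log):

```go
package destination

// defaultOpaquePorts mirrors the default list described above.
const defaultOpaquePorts = "25,443,587,3306,5432,11211"

// opaquePortsFor returns the opaque-ports value for a workload: the default
// when the annotation is absent, the annotation's value when present, and no
// ports at all when the annotation is present but empty.
func opaquePortsFor(annotations map[string]string) string {
	value, ok := annotations["config.linkerd.io/opaque-ports"]
	if !ok {
		return defaultOpaquePorts
	}
	return value // may be "", meaning no opaque ports
}
```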

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-24 14:55:31 -05:00
Kevin Leimkuhler 5bd5db6524
Revert "Rename multicluster annotation prefix and move when possible (#5771)" (#5813)
This reverts commit f9ab867cbc which renamed the
multicluster label name from `mirror.linkerd.io` to `multicluster.linkerd.io`.

While this change was made to follow similar namings in other extensions, it
complicates the multicluster upgrade process due to the secret creation.

`mirror.linkerd.io` is not that important of a label to change and this will
allow a smoother upgrade process for `stable-2.10.x`

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-24 12:54:52 -05:00
Kevin Leimkuhler ff93d2d317
Mirror opaque port annotations on services (#5770)
This change introduces an opaque ports annotation watcher that will send
destination profile updates when a service has its opaque ports annotation
change.

The user facing change introduced by this is that the opaque ports annotation is
now required on services when using the multicluster extension. This is because
the service mirror will create mirrored services in the source cluster, and
destination lookups in the source cluster need to discover that the workloads in
the target cluster are opaque protocols.

### Why

Closes #5650

### How

The destination server now has a new opaque ports annotation watcher. When a
client subscribes to updates for a service name or cluster IP, the `GetProfile`
method creates a profile translator stack that passes updates through resource
adaptors such as: traffic split adaptor, service profile adaptor, and now opaque
ports adaptor.

When the annotation on a service changes, the update is passed through to the
client where the `opaque_protocol` field will either be set to true or false.

A few scenarios to consider are:

  - If the annotation is removed from the service, the client should receive
    an update with no opaque ports set.
  - If the service is deleted, the stream stays open so the client should
    receive an update with no opaque ports set.
  - If the service has the annotation added, the client should receive that
    update.

### Testing

Unit test have been added to the watcher as well as the destination server.

An integration test has been added that tests the opaque port annotation on a
service.

For manual testing, using the destination server scripts is easiest:

```
# install Linkerd

# start the destination server
$ go run controller/cmd/main.go destination -kubeconfig ~/.kube/config

# Create a service or namespace with the annotation and inject it

# get the destination profile for that service and observe the opaque protocol field
$ go run controller/script/destination-client/main.go -method getProfile -path test-svc.default.svc.cluster.local:8080
INFO[0000] fully_qualified_name:"terminus-svc.default.svc.cluster.local" opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"terminus-svc.default.svc.cluster.local.:8080" weight:10000} 
INFO[0000]                                              
INFO[0000] fully_qualified_name:"terminus-svc.default.svc.cluster.local" opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"terminus-svc.default.svc.cluster.local.:8080" weight:10000} 
INFO[0000]
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-23 13:36:17 -05:00
Kevin Leimkuhler f9ab867cbc
Rename multicluster annotation prefix and move when possible (#5771)
This renames the multicluster annotation prefix from `mirror.linkerd.io` to
`multicluster.linkerd.io` in order to reflect other extension naming patterns.

Additionally, it moves labels only used in the Multicluster extension into their
own labels file—again to reflect other extensions.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-18 17:10:33 -05:00
Kevin Leimkuhler 5dc662ae97
Remove namespace inheritance of opaque ports annotation (#5739)
This change removes the namespace inheritance of the opaque ports annotation.
Now when setting opaque port related fields in destination profile responses, we
only look at the pod annotations.

This prepares for #5736 where the proxy-injector will add the annotation from
the namespace if the pod does not have it already.

Closes #5735

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-15 10:21:20 -05:00
Oleh Ozimok c416e78261
destination: Fix crash when EndpointSlices are enabled (#5543)
The Destination controller can panic due to a nil-deref when
the EndpointSlices API is enabled.

This change updates the controller to properly initialize values
to avoid this segmentation fault.

Fixes #5521

Signed-off-by: Oleg Ozimok <oleg.ozimok@corp.kismia.com>
2021-01-15 12:52:11 -08:00
Filip Petkovski 40192e258a
Ignore pods with status.phase=Succeeded when watching IP addresses (#5412)
When a pod terminates successfully, some CNIs will assign its IP address
to newly created pods. This can lead to duplicate pod IPs in the same
Kubernetes cluster.

Filter out pods which are in a Succeeded phase since they are not 
routable anymore.

Fixes #5394

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
2021-01-12 12:25:37 -05:00
Alejandro Pedraza d3d7f4e2e2
Destination should return `OpaqueTransport` hint when annotation matches resolved target port (#5458)
The destination service now returns an `OpaqueTransport` hint when the annotation
matches the resolved target port. This is different from the current behavior,
which always sets the hint when a proxy is present.

Closes #5421

This happens by changing the endpoint watcher to set a pod's opaque port
annotation in certain cases. If the pod already has an annotation, then its
value is used. If the pod has no annotation, then it checks the namespace that
the endpoint belongs to; if it finds an annotation on the namespace then it
overrides the pod's annotation value with that.

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-01-05 14:54:55 -05:00
Kevin Leimkuhler e65f216d52
Add endpoint to GetProfile response (#5227)
Context: #5209

This updates the destination service to set the `Endpoint` field in `GetProfile`
responses.

The `Endpoint` field is only set if the IP maps to a Pod--not a Service.

Additionally in this scenario, the default Service Profile is used as the base
profile so no other significant fields are set.

### Examples

```
# GetProfile for an IP that maps to a Service
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.43.222.0:9090
INFO[0000] fully_qualified_name:"linkerd-prometheus.linkerd.svc.cluster.local"  retry_budget:{retry_ratio:0.2  min_retries_per_second:10  ttl:{seconds:10}}  dst_overrides:{authority:"linkerd-prometheus.linkerd.svc.cluster.local.:9090"  weight:10000}
```

Before:

```
# GetProfile for an IP that maps to a Pod
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.20
INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}}
```


After:

```
# GetProfile for an IP that maps to a Pod
❯ go run controller/script/destination-client/main.go -method getProfile -path 10.42.0.20
INFO[0000] retry_budget:{retry_ratio:0.2  min_retries_per_second:10  ttl:{seconds:10}}  endpoint:{addr:{ip:{ipv4:170524692}}  weight:10000  metric_labels:{key:"control_plane_ns"  value:"linkerd"}  metric_labels:{key:"deployment"  value:"fast-1"}  metric_labels:{key:"pod"  value:"fast-1-5cc87f64bc-9hx7h"}  metric_labels:{key:"pod_template_hash"  value:"5cc87f64bc"}  metric_labels:{key:"serviceaccount"  value:"default"}  tls_identity:{dns_like_identity:{name:"default.default.serviceaccount.identity.linkerd.cluster.local"}}  protocol_hint:{h2:{}}}
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2020-11-18 15:41:25 -05:00