Commit Graph

13 Commits

Author SHA1 Message Date
Scott Fleener 156bf60ad7
feat(destination): introduce transport-protocol outbound TLS mode (#13699)
Non-opaque meshed traffic currently flows over the original destination port, which requires the inbound proxy to do protocol detection.

This adds an option to the destination controller that configures all meshed traffic to flow to the inbound proxy's inbound port. This will allow us to include more session protocol information in the future, obviating the need for inbound protocol detection.

This doesn't do much in the way of testing, since the default behavior should be unchanged. When this default changes, more validation will be done on the behavior here.

Signed-off-by: Scott Fleener <scott@buoyant.io>
2025-03-05 13:51:21 -08:00
Oliver Gould aef8a02426
feat(destination): Add meshed HTTP/2 keep-alive settings (#12504)
This commit adds destination controller configuration that enables default
keep-alives for meshed HTTP/2 clients.

This is accomplished by encoding the raw protobuf message structure into the
helm values, and then encoding that as JSON in the destination controller's
command-line options. This allows operators to set any supported HTTP/2 client
configuration without having to modify the destination controller.
2024-04-30 19:35:30 +00:00
Alex Leong 5915ef5a18
Don't send endpoint profile updates from Server updates when opaqueness doesn't change (#12013)
When the destination controller receives an update for a Server resource, we recompute opaqueness ports for all pods.  This results in a large number of updates to all endpoint profile watches, even if the opaqueness doesn't change.  In cases where there are many Server resources, this can result in a large number of updates being sent to the endpoint profile translator and overflowing the endpoint profile translator update queue.  This is especially likely to happen during an informer resync, since this will result in an informer callback for every Server in the cluster.

We refactor the workload watcher to not send these updates if the opaqueness has not changed.

This, seemingly simple, change in behavior requires a large code change because:
* the current opaqueness state is not stored on workload publishers and must be added so that we can determine if the opaqueness has changed
* storing the opaqueness in addition to the other state we're storing (pod, ip, port, etc.) means that we are not storing all of the data represented by the Address struct
* workload watcher uses a `createAddress` func to dynamically create an Address from the state it stores
* now that we are storing the Address as state, creating Addresses dynamically is no longer necessary and we can operate on the Address state directly
  * this makes the workload watcher more similar to other watchers and follow a common pattern
  * it also fixes some minor correctness issues:
    * pods that did not have the ready status condition were being considered when they should not have been
    * updates to ExternalWorkload labels were not being considered

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-03-19 10:24:02 -07:00
Oliver Gould e1ab0a2e46
destination: Fix flakey TestEndpointProfileTranslator (#12182)
TestEndpointProfileTranslator could race against the consumer task
so that the final queue insertion would not necessarily fail. The send
buffer has been eliminated to avoid this race.
2024-02-29 08:59:44 -08:00
Alex Leong 65f13de2ce
Add support for ExternalWorkloads in endpoint profiles (#11952)
When a meshed client attempts to establish a connection directly to the workload IP of an ExternalWorkload, the destination controller should return an endpoint profile for that ExternalWorkload with a single endpoint and the metadata associated with that ExternalWorkload including:
* mesh TLS identity
* workload metric labels
* opaque / protocol hints

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-01-23 09:43:12 -08:00
Oliver Gould b0fee16fb4
destination: Fix GetProfile endpoint buffering (#11815)
In 71635cb and 357a1d3 we updated the endpoint and profile translators
to prevent backpressure from stalling the informer tasks. This change
updates the endpoint profile translator with the same fix, so that
updates are buffered and can detect when when a gRPC stream is stalled.

Furthermore, the update method is updated to use a protobuf-aware
comparison method instead of using serialization and string comparison.

A test is added for the endpoint profile translator, since none existed
previously.
2023-12-22 09:25:12 -08:00
Alex Leong a92d17dbe9
Fix error for profile lookups on unmeshed pods with port in default opaque list (#11550)
When we do a `GetProfile` lookup for an unmeshed pod, we set the `weightedAddr.ProtocolHint` to an empty value `&pb.ProtocolHint{}` to indicate that the address is unmeshed and has no protocol hint.  However, when the looked up port is in the default opaque list, we erroneously check if `weightedAddr.ProtocolHint != nil` to determine if we should attempt to get the inbound listen port for that pod.  Since `&pb.ProtocolHint{} != nil`, we attempt to get the inbound listen port for the unmeshed pod.  This results in an error, preventing any valid `GetProfile` responses from being returned.

We update the initialization logic for `weightedAddr.ProtocolHint` to only create a struct when a protocol hint is present and to leave it as `nil` if the pod is unmeshed.

We add a simple unit test for this behavior as well.

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-12-20 13:56:49 -08:00
Alejandro Pedraza c7db3b3b80
Remove k8sAPI/metadataAPI dependency from endpointProfileTranslator (#11587)
This removes `endpointProfileTranslator`'s dependency on the k8sAPI and
metadataAPI, which are not used. This was introduced in one of #11334's
refactorings, but ended up being not required. No functional changes
here.
2023-11-14 09:04:36 -05:00
Alex Leong 8cf386355f
Fix GetProfiles error when address is opaque and unmeshed (#11556)
When we do a GetProfile lookup for an opaque port on an unmeshed pod,
we attempt to look up the inbound listen port of that pod's proxy. Since
that pod has no proxy, this fails and we return an error to the GetProfile
API call. This causes the proxy to fail to be able to resolve the profile and
be unable to route the traffic.

We revert to the previous behavior of only logging when we cannot look
up the inbound listen port instead of returning an error.

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-11-02 06:59:39 -07:00
Alex Leong 4e7a588a2c
Add pod name to context token and logging (#11532)
When the destination controller logs about receiving or sending messages to a data plane proxy, there is no information in the log about which data plane pod it is communicating with.  This can make it difficult to diagnose issues which span the data plane and control plane.

We add a `pod` field to the context token that proxies include in requests to the destination controller.  We add this pod name to the logging context so that it shows up in log messages.  In order to accomplish this, we had to plumb through logging context in a few places where it previously had not been.  This gives us a more complete logging context and more information in each log message.

An example log message with this fuller logging context is:

```
time="2023-10-24T00:14:09Z" level=debug msg="Sending destination add: add:{addrs:{addr:{ip:{ipv4:183762990}  port:8080}  weight:10000  metric_labels:{key:\"control_plane_ns\"  value:\"linkerd\"}  metric_labels:{key:\"deployment\"  value:\"voting\"}  metric_labels:{key:\"pod\"  value:\"voting-7475cb974c-2crt5\"}  metric_labels:{key:\"pod_template_hash\"  value:\"7475cb974c\"}  metric_labels:{key:\"serviceaccount\"  value:\"voting\"}  tls_identity:{dns_like_identity:{name:\"voting.emojivoto.serviceaccount.identity.linkerd.cluster.local\"}}  protocol_hint:{h2:{}}}  metric_labels:{key:\"namespace\"  value:\"emojivoto\"}  metric_labels:{key:\"service\"  value:\"voting-svc\"}}" addr=":8086" component=endpoint-translator context-ns=emojivoto context-pod=web-767f4484fd-wmpvf remote="10.244.0.65:52786" service="voting-svc.emojivoto.svc.cluster.local:8080"
```

Note the `context-pod` field.

Additionally, we have tested this when no pod field is included in the context token (e.g. when handling requests from a pod which does not yet add this field) and confirmed that the `context-pod` log field is empty, but no errors occur.

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-10-25 13:48:42 -07:00
Alejandro Pedraza 65ddba4e5d
dst: Update `GetProfile`'s stream when pod associated to HostPort lookup changes (#11334)
Followup to #11328

Implements a new pod watcher, instantiated along the other ones in the Destination server. It also watches on Servers and carries all the logic from ServerWatcher, which has now been decommissioned.

The `CreateAddress()` function has been moved into a function of the PodWatcher, because now we're calling it on every update given the pod associated to an ip:port might change and we need to regenerate the Address object. That function also takes care of capturing opaque protocol info from associated Servers, which is not new and had some logic that was duped in the now defunct ServerWatcher. `getAnnotatedOpaquePorts()` got also moved for similar reasons.

Other things to note about PodWatcher:

- It publishes a new pair of metrics `ip_port_subscribers` and `ip_port_updates` leveraging the framework in `prometheus.go`.
- The complexity in `updatePod()` is due to only send stream updates when there are changes in the pod's readiness, to avoid sending duped messages on every pod lifecycle event.
- 
Finally, endpointProfileTranslator's `endpoint` (*pb.WeightedAddr) not being a static object anymore, the `Update()` function now receives an Address that allows it to rebuild the endpoint on the fly (and so `createEndpoint()` was converted into a method of endpointProfileTranslator).
2023-09-28 08:57:52 -05:00
Kevin Leimkuhler 67bcd8f642
Add `gosec` and `errcheck` lints (#7954)
Closes #7826

This adds the `gosec` and `errcheck` lints to the `golangci` configuration. Most significant lints have been fixed my individual changes, but this enables them by default so that all future changes are caught ahead of time.

A significant amount of these lints are been exluced by the various `exclude-rules` rules added to `.golangci.yml`. These include operations are files that generally do not fail such as `Copy`, `Flush`, or `Write`. We also choose to ignore most errors when cleaning up functions via the `defer` keyword.

Aside from those, there are several other rules added that all have comments explaining why it's okay to ignore the errors that they cover.

Finally, several smaller fixes in the code have been made where it seems necessary to catch errors or at least log them.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-03-03 10:09:51 -07:00
Kevin Leimkuhler 147d85dc70
Update `GetProfile` clients with policy server updates (#7388)
### What

`GetProfile` clients do not receive destinatin profiles that consider Server protocol fields the way that `Get` clients do. If a Server exists for a `GetProfile` destination that specifies the protocol for that destination is `opaque`, this information is not passed back to the client.

#7184 added this for `Get` by subscribing clients to Endpoint/EndpointSlice updates. When there is an update, or there is a Server update, the endpoints watcher passes this information back to the endpoint translator which handles sending the update back to the client.

For `GetProfile` the situation is different. As with `Get`, we only consider Servers when dealing with Pod IPs, but this only occurs in two situations for `GetProfile`.

1. The destination is a Pod IP and port
2. The destionation is an Instance ID and port

In both of these cases, we need to check if a already Server selects the endpoint and we need to subscribe for Server updates incase one is added or deleted which selects the endpoint.

### How

First we check if there is already a Server which selects the endpoint. This is so that when the first destionation profile is returned, the client knows if the destination is `opaque` or not.

After sending that first update, we then subscribe the client for any future updates which will come from a Server being added or deleted.

This is handled by the new `ServerWatcher` which watches for Server updates on the cluster; when an update occurs it sends that to the `endpointProfileTranslator` which translates the protcol update into a DestinationProfile.

By introducing the `endpointProfileTranslator` which only handles protocol updates, we're able to decouple the endpoint logic from `profileTranslator`—it's `endpoint` field has been removed now that it only handles updates for ServiceProfiles for Services.

### Testing

A unit test has been added and below are some manual testing instructions to see how it interacts with Server updates:

<details>
	<summary>app.yaml</summary>

	```yaml
	apiVersion: v1
	kind: Pod
	metadata:
	  name: pod
	  labels:
		app: pod
	spec:
	  containers:
	  - name: app
		image: nginx
		ports:
		  - name: http
			containerPort: 80
	---
	apiVersion: policy.linkerd.io/v1beta1
	kind: Server
	metadata:
	  name: srv
	  labels:
		policy: srv
	spec:
	  podSelector:
		matchLabels:
		  app: pod
	  port: 80
	  proxyProtocol: opaque
	```
</details>

```shell
$ go run ./controller/cmd/main.go destination
```

```shell
$ linkerd inject app.yaml |kubectl apply -f -
...
$ kubectl get pods -o wide
NAME   READY   STATUS    RESTARTS   AGE   IP           NODE                       NOMINATED NODE   READINESS GATES
pod    2/2     Running   0          53m   10.42.0.34   k3d-k3s-default-server-0   <none>           <none>
$ go run ./controller/script/destination-client/main.go -method getProfile -path 10.42.0.34:80
...
```

You can add/delete `srv` as well as edit its `proxyProtocol` field to observe the correct DestinationProfile updates.

Signed-off-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2021-12-08 12:26:27 -07:00