Commit Graph

28 Commits

Author SHA1 Message Date
Alejandro Pedraza f8f73c47c7
Fix destination controller 2024-04-23 15:05:11 -05:00
occupyhabit 6eeaea4d94
chore: Remove repetitive words (#12330)
Signed-off-by: occupyhabit <wangmengjiao@outlook.com>
2024-03-25 09:33:39 -07:00
Alejandro Pedraza b697e285a0
Refactor IPv4-only functions to also work for IPv6 (#12303)
The main change here is the refactoring of the address functions in `addr.go` that support the Destination controller and Viz's Tap controller. Some of those functions only worked for IPv4, so this change refactors them to be IP-family agnostic.
This enabled adding (and fixing) IPv6 unit tests as detailed in the following sections.

Other changes:

- The `ProxyAddressesToString()` function was no longer used, so it got removed.
- The `ProxyIPToString()` function was only used by the destination-client script, so that got stripped out.

## `addr_test.go`

We added IPv6 cases to each test, which would have failed previously.

## `endpoint_translator_test.go`

One of the test pods (pod3) was changed to have an IPv6 address. Without the other changes in this PR those tests would still have passed, but only because, when comparing actual IPs with expected ones, we weren't checking whether both were empty. So here we added checks against that.

## `server_test.go`

As above, we added checks against empty IPs, and in the mocked resources in `test_util.go` we added an IPv6 EndpointSlice.
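
As a rough, standalone illustration of what "IP-family agnostic" means here (this is a sketch, not the actual `addr.go` code), Go's `net/netip` and `net.JoinHostPort` handle both families uniformly:

``` go
package main

import (
	"fmt"
	"net"
	"net/netip"
	"strconv"
)

// addrToString renders an ip:port pair for either IP family.
func addrToString(ip string, port uint32) (string, error) {
	parsed, err := netip.ParseAddr(ip) // accepts "10.0.0.1" and "2001:db8::1"
	if err != nil {
		return "", err
	}
	// JoinHostPort adds the [brackets] required around IPv6 literals.
	return net.JoinHostPort(parsed.String(), strconv.Itoa(int(port))), nil
}

func main() {
	s, _ := addrToString("2001:db8::1", 8080)
	fmt.Println(s) // [2001:db8::1]:8080
}
```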
2024-03-22 07:20:52 -05:00
Alex Leong 5915ef5a18
Don't send endpoint profile updates from Server updates when opaqueness doesn't change (#12013)
When the destination controller receives an update for a Server resource, we recompute the opaque ports for all pods.  This results in a large number of updates to all endpoint profile watches, even if the opaqueness doesn't change.  In cases where there are many Server resources, this can result in a large number of updates being sent to the endpoint profile translator and overflowing its update queue.  This is especially likely to happen during an informer resync, since that results in an informer callback for every Server in the cluster.

We refactor the workload watcher to not send these updates if the opaqueness has not changed.

This seemingly simple change in behavior requires a large code change because:
* the current opaqueness state is not stored on workload publishers and must be added so that we can determine if the opaqueness has changed
* storing the opaqueness in addition to the other state we're storing (pod, ip, port, etc.) means that we are now storing all of the data represented by the Address struct
* workload watcher uses a `createAddress` func to dynamically create an Address from the state it stores
* now that we are storing the Address as state, creating Addresses dynamically is no longer necessary and we can operate on the Address state directly
  * this makes the workload watcher more similar to other watchers and follow a common pattern
  * it also fixes some minor correctness issues:
    * pods that did not have the ready status condition were being considered when they should not have been
    * updates to ExternalWorkload labels were not being considered
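
A minimal, self-contained sketch of the gating idea (names are illustrative; the real watcher stores the full Address alongside this state):

``` go
package sketch

// opaqueness is the slice of workload state whose changes should trigger
// an endpoint profile update.
type opaqueness struct {
	opaqueProtocol bool
	opaquePorts    map[uint32]struct{}
}

// changed reports whether an endpoint profile update actually needs to be
// published for this workload after a Server update.
func (prev opaqueness) changed(next opaqueness) bool {
	if prev.opaqueProtocol != next.opaqueProtocol ||
		len(prev.opaquePorts) != len(next.opaquePorts) {
		return true
	}
	for p := range next.opaquePorts {
		if _, ok := prev.opaquePorts[p]; !ok {
			return true
		}
	}
	return false
}
```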

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-03-19 10:24:02 -07:00
Matei David 98e38a66b6
Rename meshTls to meshTLS in ExternalWorkload CRD (#12098)
The ExternalWorkload resource we introduced has a minor naming
inconsistency; `Tls` in `meshTls` is not capitalised. Other resources
that we have (e.g. authentication resources) capitalise TLS (and so does
Go, which follows a similar naming convention).

We fix this in the workload resource by changing the field's name and
bumping the version to `v1beta1`.

Upgrading the control plane version will continue to work without
downtime. However, if an ExternalWorkload resource already exists, the
policy controller will not fully initialise. It will not enter a
crashloop backoff, but it will also not become ready until the resource
is edited or deleted.
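
For illustration only (not the exact upstream bindings), a sketch of how the
renamed field might look in the v1beta1 Go types; the nested field names are
assumptions:

``` go
// ExternalWorkloadSpec sketch: the field now serializes as "meshTLS"
// (v1alpha1 used "meshTls").
type ExternalWorkloadSpec struct {
	MeshTLS MeshTLS `json:"meshTLS"`
	// ...other fields elided...
}

type MeshTLS struct {
	Identity   string `json:"identity"`
	ServerName string `json:"serverName"`
}
```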

Signed-off-by: Matei David <matei@buoyant.io>
2024-02-20 11:00:13 -08:00
Oliver Gould 2ab76b64c6
destination: Rename zone weighting flag to ext-endpoint-zone-weights (#12090) 2024-02-16 09:06:56 -05:00
Alex Leong 5aef29ace1
Update workload watcher and server tests to use EndpointSlices (#12054)
Fixes #12032

The Destination controller server tests test the destination server with `enableEndpointSlices=false`.  The default for this value is true, meaning that these tests do not test the default configuration.

We update the tests to test with `enableEndpointSlices=true` and update the corresponding mock kubernetes Endpoints resources to be EndpointSlices instead.  We also fix an instance where the workload watcher was using Endpoints even when in EndpointSlices mode.
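
For illustration, a hedged sketch of the kind of EndpointSlice fixture the updated mocks use (names, addresses, and ports here are invented):

``` go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	discoveryv1 "k8s.io/api/discovery/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// mockEndpointSlice builds an EndpointSlice fixture of the kind used in place
// of an Endpoints object when enableEndpointSlices=true.
func mockEndpointSlice() *discoveryv1.EndpointSlice {
	port := int32(8080)
	ready := true
	return &discoveryv1.EndpointSlice{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "name1-ipv4",
			Namespace: "ns",
			// This label ties the slice back to its parent Service.
			Labels: map[string]string{discoveryv1.LabelServiceName: "name1"},
		},
		AddressType: discoveryv1.AddressTypeIPv4,
		Endpoints: []discoveryv1.Endpoint{{
			Addresses:  []string{"172.17.0.12"},
			Conditions: discoveryv1.EndpointConditions{Ready: &ready},
			TargetRef:  &corev1.ObjectReference{Kind: "Pod", Name: "name1-1", Namespace: "ns"},
		}},
		Ports: []discoveryv1.EndpointPort{{Port: &port}},
	}
}
```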

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-02-09 11:33:06 -08:00
Alex Leong 53287d40a8
Reflect changes to server selector in opaqueness (#12031)
Fixes #11995

If a Server is marking a Pod's port as opaque and then the Server's podSelector is updated to no longer select that Pod, then the Pod's port should no longer be marked as opaque. However, this update does not result in any updates from the destination API's Get stream and the port remains marked as opaque.

We fix this by updating the endpoint watcher's handling of Server updates to consider both the old and the new Server.
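
A self-contained sketch of the idea, with illustrative names rather than the actual watcher API: the pods that need a re-publish are those matched by either the old or the new podSelector, not just the new one.

``` go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// podsToUpdate returns every pod matched by the old or new selector, so that
// pods the Server no longer selects also get their opaqueness recomputed.
func podsToUpdate(oldSel, newSel *metav1.LabelSelector, pods []*corev1.Pod) ([]*corev1.Pod, error) {
	old, err := metav1.LabelSelectorAsSelector(oldSel)
	if err != nil {
		return nil, err
	}
	cur, err := metav1.LabelSelectorAsSelector(newSel)
	if err != nil {
		return nil, err
	}
	var out []*corev1.Pod
	for _, p := range pods {
		l := labels.Set(p.Labels)
		if old.Matches(l) || cur.Matches(l) {
			out = append(out, p)
		}
	}
	return out, nil
}
```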

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-02-05 12:25:16 -08:00
Alex Leong 0261e4f086
Return INVALID_ARGUMENT status codes properly from GetProfile (#11980)
When the destination controller receives a GetProfile request for an ExternalName service, it should return gRPC status code INVALID_ARGUMENT to signal to the proxy that ExternalName services do not have endpoints and service discovery should not be used.  However, the destination controller is returning gRPC status code UNKNOWN instead.  This causes the proxy to retry these requests, resulting in a flurry of warning logs in the destination controller.

We fix the destination controller to properly return INVALID_ARGUMENT instead of UNKNOWN for ExternalName services.
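
A minimal sketch of the guard (the check shown is illustrative of the fix, not the exact destination code):

``` go
package sketch

import (
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
	corev1 "k8s.io/api/core/v1"
)

// checkForExternalName returns the terminal status for ExternalName services:
// they have no endpoints, so the proxy should not use service discovery here.
func checkForExternalName(svc *corev1.Service) error {
	if svc.Spec.Type == corev1.ServiceTypeExternalName {
		// INVALID_ARGUMENT is terminal for the proxy; UNKNOWN made it retry.
		return status.Errorf(codes.InvalidArgument,
			"ExternalName service %s/%s has no endpoints", svc.Namespace, svc.Name)
	}
	return nil
}
```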

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-01-26 09:11:00 -08:00
Alex Leong 65f13de2ce
Add support for ExternalWorkloads in endpoint profiles (#11952)
When a meshed client attempts to establish a connection directly to the workload IP of an ExternalWorkload, the destination controller should return an endpoint profile for that ExternalWorkload with a single endpoint and the metadata associated with that ExternalWorkload including:
* mesh TLS identity
* workload metric labels
* opaque / protocol hints

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-01-23 09:43:12 -08:00
Zahari Dichev 391ce919f5
policy: regenerate Server go bindings (#11920)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2024-01-15 11:09:31 +02:00
Oliver Gould b0fee16fb4
destination: Fix GetProfile endpoint buffering (#11815)
In 71635cb and 357a1d3 we updated the endpoint and profile translators
to prevent backpressure from stalling the informer tasks. This change
updates the endpoint profile translator with the same fix, so that
updates are buffered and we can detect when a gRPC stream is stalled.

Furthermore, the update method now uses a protobuf-aware comparison
instead of serialization and string comparison.

A test is added for the endpoint profile translator, since none existed
previously.
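
For illustration, the comparison reduces to something like the following sketch (assuming the generated `destination` package; the function name is invented):

``` go
package sketch

import (
	pb "github.com/linkerd/linkerd2-proxy-api/go/destination"
	"google.golang.org/protobuf/proto"
)

// profileChanged sketches the protobuf-aware comparison: proto.Equal compares
// the messages structurally rather than via serialized strings.
func profileChanged(prev, next *pb.DestinationProfile) bool {
	return !proto.Equal(prev, next)
}
```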
2023-12-22 09:25:12 -08:00
Oliver Gould 5e558ae3e5
destination: Add optional experimental endpoint weighting (#11795)
This change adds a runtime flag to the destination controller,
--experimental-endpoint-zone-weights=true, that causes endpoints in the
local zone to receive higher weights. This feature is disabled by
default, since the weight value is not honored by proxies. No helm
configuration is exposed yet, either.

This weighting is instrumented in the endpoint translator. Tests are
added to confirm that the behavior is feature-gated.

Additionally, this PR adds the "zone" metric label to endpoint metadata
responses.
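
A hedged sketch of the feature-gated weighting; the flag name comes from this
PR, while the weight values and function names are assumptions:

``` go
package sketch

const defaultWeight uint32 = 10000

// endpointWeight biases traffic toward same-zone endpoints when the
// experimental flag is enabled; proxies that ignore weights are unaffected.
func endpointWeight(endpointZone, localZone string, zoneWeightingEnabled bool) uint32 {
	if zoneWeightingEnabled && endpointZone != "" && endpointZone == localZone {
		return defaultWeight * 10
	}
	return defaultWeight
}
```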
2023-12-20 13:11:30 -08:00
Alex Leong 71635cbf3d
Add queuing to profile translator (#11546)
https://github.com/linkerd/linkerd2/pull/11491 changed the EndpointTranslator to use a queue to avoid calling `Send` on a gRPC stream directly from an informer callback goroutine.  This change updates the ProfileTranslator in the same way, adding a queue to ensure we do not block the informer thread.

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-11-09 07:32:30 -05:00
Oliver Gould 8cd165a1e4
destination: Index pods and service with multiple IPs (#11515)
The destination controller indexes Pods by their primary PodIP and their
primary HostIP (when applicable); and it indexes Services by their
primary ClusterIP.

In preparation for dual-stack (IPv6) support, this change updates the
destination controller indexers to consume all IPs for these resources.
Tests are updated to demonstrate a dual-stack setup (where these
resources have a primary IPv4 address and a secondary IPv6 address).

While exercising these tests, I had to replace the brittle host:port
parsing logic we use in the server, in favor of Go's standard
`net.SplitHostPort` utility.
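
For reference, a small standalone example of the standard utility that replaced the hand-rolled parsing:

``` go
package main

import (
	"fmt"
	"net"
)

// net.SplitHostPort handles both IPv4 and bracketed IPv6 literals.
func main() {
	for _, s := range []string{"10.0.0.1:8080", "[2001:db8::1]:8080"} {
		host, port, err := net.SplitHostPort(s)
		fmt.Println(host, port, err)
	}
}
```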
2023-10-23 14:55:33 -07:00
Alex Leong 357a1d32b2
Add update queue to endpoint translator (#11491)
When a grpc client of the destination.Get API initiates a request but then doesn't read off of that stream, the HTTP2 stream flow control window will fill up and eventually exert backpressure on the destination controller.  This manifests as calls to `Send` on the stream blocking.  Since `Send` is called synchronously from the client-go informer callback (by way of the endpoint translator), this blocks the informer callback and prevents all further informer callbacks from firing.  This causes the destination controller to stop sending updates to any of its clients.

We add a queue in the endpoint translator so that when it gets an update from the informer callback, that update is queued and we avoid potentially blocking the informer callback.  Each endpoint translator spawns a goroutine to process this queue and call `Send`.  If there is not capacity in this queue (e.g. because a client has stopped reading and we are experiencing backpressure) then we terminate the stream.
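
A pared-down sketch of that queueing pattern (field and method names are illustrative, not the actual implementation):

``` go
package sketch

import (
	pb "github.com/linkerd/linkerd2-proxy-api/go/destination"
)

type endpointTranslator struct {
	updates   chan *pb.Update          // bounded buffer between informer and stream
	endStream chan struct{}            // signals that the stream should be torn down
	stream    pb.Destination_GetServer // the gRPC stream we Send on
}

// enqueue is called from the informer callback and never blocks.
func (t *endpointTranslator) enqueue(u *pb.Update) {
	select {
	case t.updates <- u:
	default:
		// Queue full: the client has stopped reading. Signal stream teardown
		// instead of blocking the shared informer goroutine.
		select {
		case t.endStream <- struct{}{}:
		default:
		}
	}
}

// drain runs in a per-stream goroutine, forwarding queued updates to Send.
func (t *endpointTranslator) drain() {
	for {
		select {
		case u := <-t.updates:
			if err := t.stream.Send(u); err != nil {
				return
			}
		case <-t.endStream:
			return
		}
	}
}
```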

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-10-18 12:34:38 -07:00
Alejandro Pedraza c67985def0
Extend unit test for HostPort subscriptions (#11439)
Followup to https://github.com/linkerd/linkerd2/pull/11334#issuecomment-1736093592

This extends the test introduced in #11334 to exercise upgrading a
Server associated with a pod's HostPort, and observing how the stream
updates the OpaqueProtocol field.

Helper functions were refactored a bit to allow retrieving the
l5dCRDClientSet used when building the fake API.
2023-10-02 13:51:19 -05:00
Alejandro Pedraza 65ddba4e5d
dst: Update `GetProfile`'s stream when pod associated to HostPort lookup changes (#11334)
Followup to #11328

Implements a new pod watcher, instantiated alongside the other watchers in the Destination server. It also watches on Servers and carries all the logic from ServerWatcher, which has now been decommissioned.

The `CreateAddress()` function has been moved into the PodWatcher as a method, because we now call it on every update: the pod associated with an ip:port might change and we need to regenerate the Address object. That function also takes care of capturing opaque-protocol info from associated Servers, which is not new and had some logic that was duplicated in the now-defunct ServerWatcher. `getAnnotatedOpaquePorts()` also got moved for similar reasons.

Other things to note about PodWatcher:

- It publishes a new pair of metrics, `ip_port_subscribers` and `ip_port_updates`, leveraging the framework in `prometheus.go` (see the sketch after this list).
- The complexity in `updatePod()` comes from only sending stream updates when the pod's readiness changes, to avoid sending duplicate messages on every pod lifecycle event.

Finally, since endpointProfileTranslator's `endpoint` (*pb.WeightedAddr) is no longer a static object, the `Update()` function now receives an Address that allows it to rebuild the endpoint on the fly (and so `createEndpoint()` was converted into a method of endpointProfileTranslator).
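
As a sketch of the two metrics mentioned in the list above (metric names from this description; labels and help text are assumptions):

``` go
package sketch

import "github.com/prometheus/client_golang/prometheus"

var (
	ipPortSubscribers = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "ip_port_subscribers",
			Help: "Number of subscribers for a given ip:port lookup.",
		},
		[]string{"ip", "port"},
	)
	ipPortUpdates = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "ip_port_updates",
			Help: "Number of updates sent to subscribers of a given ip:port lookup.",
		},
		[]string{"ip", "port"},
	)
)

func init() {
	prometheus.MustRegister(ipPortSubscribers, ipPortUpdates)
}
```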
2023-09-28 08:57:52 -05:00
Alejandro Pedraza 3d1a3e018c
dst: Stop overriding Host IP with Pod IP on HostPort lookup (#11328)
* stopgap fix for hostport staleness

## Problem

When there's a pod with a `hostPort` entry, `GetProfile` requests
targeting the host's IP and that `hostPort` return an endpoint profile
with that pod's IP and `containerPort`. If that pod vanishes and another
one in that same host with that same `hostPort` comes up, the existing
`GetProfile` streams won't get updated with the new pod information
(metadata, identity, protocol).

That breaks the connectivity of the client proxy relying on that stream.

## Partial Solution

It should be less surprising for those `GetProfile` requests to return
an endpoint profile with the same host IP and port requested, and leave
it to the cluster's CNI to perform the translation to the corresponding
pod IP and `containerPort`.

This PR performs that change, while continuing to return the
corresponding pod's information alongside.

If the pod associated with that host IP and port changes, the client
proxy won't lose connectivity, but the pod's information won't get
updated (that'll be fixed in a separate PR).

A new unit test validating this has been added, which will be expanded
to validate the changed pod information when that gets implemented.

## Details of Change

- We no longer do the HostPort->ContainerPort conversion, so the
  `getPortForPod` function was dropped.
- The `getPodByIp` function is now split in two: `getPodByPodIP`
  and `getPodByHostIP`, the latter being called only if the former
  doesn't return anything.
- The `createAddress` function is now simplified in that it just uses
  the passed IP to build the address. The passed IP depends on which
  of the two functions just mentioned returned the pod (host IP or pod
  IP).
2023-09-06 10:35:14 -05:00
Alex Leong db2e543b0c
Disable local traffic policy for remote discovery (#11257)
When a service has its internal traffic policy set to "local", we will perform filtering to only return local endpoints, as per the ForZone hints in the endpoints. However, ForZone calculations do not take resources from remote clusters into account, so this type of filtering is not appropriate for remote discovery services.

We explicitly ignore any internal traffic policy when doing remote discovery.
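
A minimal sketch of that guard (illustrative names; not the actual endpoint-filtering code):

``` go
package sketch

import corev1 "k8s.io/api/core/v1"

// shouldFilterByZone: ForZone-based filtering only applies to local discovery,
// since hints aren't meaningful for endpoints from remote clusters.
func shouldFilterByZone(svc *corev1.Service, remoteDiscovery bool) bool {
	if remoteDiscovery {
		return false // ignore internalTrafficPolicy for remote-discovery services
	}
	return svc.Spec.InternalTrafficPolicy != nil &&
		*svc.Spec.InternalTrafficPolicy == corev1.ServiceInternalTrafficPolicyLocal
}
```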

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-08-16 15:27:58 -07:00
Alex Leong 368b63866d
Add support for remote discovery (#11224)
Adds support for remote discovery to the destination controller.

When the destination controller gets a `Get` request for a Service with the `multicluster.linkerd.io/remote-discovery` label, this is an indication that the destination controller should discover the endpoints for this service from a remote cluster.  The destination controller will look for a remote cluster which has been linked to it (using the `linkerd multicluster link` command) with that name.  It will look at the `multicluster.linkerd.io/remote-discovery` label for the service name to look up in that cluster.  It then streams back the endpoint data for that remote service.

Since we now have multiple client-go informers for the same resource types (one for the local cluster and one for each linked remote cluster) we add a `cluster` label onto the prometheus metrics for the informers and EndpointWatchers to ensure that each of these components' metrics are correctly tracked and don't overwrite each other.

---------

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-08-11 09:31:45 -07:00
Oliver Gould 08f97cc26f
destination: Avoid sending spurious profile updates (#10517)
The GetProfile API endpoint does not behave as expected: when a profile
watch is established, the API server starts two separate profile
watches, a primary watch in the client's namespace and a fallback watch
ignoring the client's namespace. These watches race to send data back to
the client. If the backup watch updates first, its data may be sent to
clients before being corrected by a subsequent update. If the primary
watch updates with an empty value, the default profile may be served
before being corrected by an update to the backup watch.

From the proxy's perspective, we'd much prefer that the API provide a
single authoritative response when possible. It avoids needless
corrective work spreading across the system on every watch initiation.

To fix this, we modify the fallbackProfileListener to behave
predictably: it only emits updates once both its primary and fallback
listeners have been updated. This avoids emitting updates based on a
partial understanding of the cluster state.
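
A self-contained sketch of that gating logic, with illustrative types and names:

``` go
package sketch

type profile struct{ name string }

// fallbackListener emits nothing until both the primary and backup listeners
// have reported; the primary wins whenever it has a value.
type fallbackListener struct {
	primary, backup       *profile
	primarySet, backupSet bool
	send                  func(*profile)
}

func (l *fallbackListener) updatePrimary(p *profile) {
	l.primary, l.primarySet = p, true
	l.publish()
}

func (l *fallbackListener) updateBackup(p *profile) {
	l.backup, l.backupSet = p, true
	l.publish()
}

func (l *fallbackListener) publish() {
	if !l.primarySet || !l.backupSet {
		return // avoid emitting from a partial view of cluster state
	}
	if l.primary != nil {
		l.send(l.primary)
		return
	}
	l.send(l.backup)
}
```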

Furthermore, the opaquePortsAdaptor is updated to avoid synthesizing a
default serviceprofile (surprising behavior) and, instead, this
defaulting logic is moved into a dedicated defaultProfileListener
helper. A dedupProfileListener is added to squelch obviously redundant
updates.

Finally, this newfound predictability allows us to simplify the API's
tests. Many of the API tests are not clear in what they test and
sometimes make assertions about the "incorrect" profile updates.
2023-03-13 13:36:18 -07:00
Alejandro Pedraza 4a84f2cb32
Implement the k8s metadata API in the Destination controller (#10326)
Fixes #9986

After reviewing the k8s API calls in Destination, it was concluded we
could only swap out the calls to the Node and RS resources to use the
metadata API, as all the other resources (Endpoints, EndpointSlices,
Services, Pod, ServiceProfiles, Server) required fields other than those
found in their metadata section.

This also required completing the `NewFakeAPI` implementation by adding
the missing annotations and labels entries.
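
For illustration, a hedged sketch of what swapping a resource over to the metadata API looks like in client-go terms (function name and resync interval are assumptions):

``` go
package sketch

import (
	"time"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/metadata"
	"k8s.io/client-go/metadata/metadatainformer"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

// newRSMetadataInformer builds a metadata-only informer for ReplicaSets: it
// receives only ObjectMeta (labels, annotations, owner refs), not full specs.
func newRSMetadataInformer(cfg *rest.Config) (cache.SharedIndexInformer, error) {
	client, err := metadata.NewForConfig(cfg)
	if err != nil {
		return nil, err
	}
	factory := metadatainformer.NewSharedInformerFactory(client, 10*time.Minute)
	gvr := schema.GroupVersionResource{Group: "apps", Version: "v1", Resource: "replicasets"}
	return factory.ForResource(gvr).Informer(), nil
}
```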

## Testing Memory Consumption

The gains here aren't as big as in #9650. In order to test this we need
to push hard and create 4000 RS:

``` bash
for i in {0..4000}; do kubectl create deployment test-pod-$i --image=nginx; done
```

In edge-23.2.1 the destination pod's memory consumption goes from 40Mi
to 160Mi after all the RS were created. With this change, it went from
37Mi to 140Mi.
2023-02-13 17:30:07 -05:00
dependabot[bot] 62d6d7cd52
build(deps): bump sigs.k8s.io/gateway-api from 0.5.1 to 0.6.0 (#10038)
* build(deps): bump sigs.k8s.io/gateway-api from 0.5.1 to 0.6.0

Bumps [sigs.k8s.io/gateway-api](https://github.com/kubernetes-sigs/gateway-api) from 0.5.1 to 0.6.0.
- [Release notes](https://github.com/kubernetes-sigs/gateway-api/releases)
- [Changelog](https://github.com/kubernetes-sigs/gateway-api/blob/main/CHANGELOG.md)
- [Commits](https://github.com/kubernetes-sigs/gateway-api/compare/v0.5.1...v0.6.0)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/gateway-api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Account for possible errors returned from `AddEventHandler`

In v0.26.0 client-go's `AddEventHandler` method for informers started
returning a registration handle (that we ignore) and an error that we
now surface up.

* client-go v0.26.0 removed the openstack plugin

* Temporary changes to trigger tests in k8s 1.21

- Adds an innocuous change to integration.yml so that all tests get
  triggered
- Hard-code k8s version in `k3d cluster create` invocation to v1.21

* Revert "Temporary changes to trigger tests in k8s 1.21"

This reverts commit 3e1fdd0e5e.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2023-01-16 09:38:09 -05:00
Alejandro Pedraza 7428d4aa51
Removed dupe imports (#10049)
* Removed dupe imports

My IDE (vim-gopls) has been complaining for a while, so I decided to take
care of it. Found via
[staticcheck](https://github.com/dominikh/go-tools)

* Add stylecheck to go-lint checks
2023-01-10 14:34:56 -05:00
Alejandro Pedraza 52ae875e9d
Fixes HostPort mapping lookup that was generating a false warning (#9918)
When performing the HostPort mapping introduced in #9819, the `containsIP` function iterates through the pod IPs searching for a match against `targetIP` using `ip.String()`, but that returns something like `&PodIP{IP: xxx}`. Fixed that to just use `ip.IP`, and also completed the test fixtures to include both `PodIP` and `PodIPs` in the pods manifests.
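
A sketch of the corrected comparison (illustrative signature, not the exact helper):

``` go
package sketch

import corev1 "k8s.io/api/core/v1"

// containsIP matches on the PodIP's IP field, not on the struct's generated
// String() output ("&PodIP{IP: ...}"), which never matches a plain IP string.
func containsIP(pod *corev1.Pod, targetIP string) bool {
	for _, ip := range pod.Status.PodIPs {
		if ip.IP == targetIP { // previously ip.String()
			return true
		}
	}
	return pod.Status.PodIP == targetIP
}
```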

Note this wasn't affecting the end result, it was just producing an extra warning as shown below, that this change eliminates:

```bash
$ go test -v ./controller/api/destination/... -run TestGetProfiles
=== RUN   TestGetProfiles
...
=== RUN   TestGetProfiles/Return_profile_with_endpoint_when_using_pod_DNS
time="2022-11-29T09:38:48-05:00" level=info msg="waiting for caches to sync"
time="2022-11-29T09:38:49-05:00" level=info msg="caches synced"
time="2022-11-29T09:38:49-05:00" level=warning msg="unable to find container port as host (172.17.13.15) matches neither PodIP nor HostIP (&Pod{ObjectMeta:{pod-0  ns    0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[linkerd.io/control-plane-ns:linkerd] map[] [] [] []},Spec:PodSpec{Volumes:[]Volume{},Containers:[]Container{},RestartPolicy:,TerminationGracePeriodSeconds:nil,ActiveDeadlineSeconds:nil,DNSPolicy:,NodeSelector:map[string]string{},ServiceAccountName:,DeprecatedServiceAccount:,NodeName:,HostNetwork:false,HostPID:false,HostIPC:false,SecurityContext:nil,ImagePullSecrets:[]LocalObjectReference{},Hostname:,Subdomain:,Affinity:nil,SchedulerName:,InitContainers:[]Container{},AutomountServiceAccountToken:nil,Tolerations:[]Toleration{},HostAliases:[]HostAlias{},PriorityClassName:,Priority:nil,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[]PodReadinessGate{},RuntimeClassName:nil,EnableServiceLinks:nil,PreemptionPolicy:nil,Overhead:ResourceList{},TopologySpreadConstraints:[]TopologySpreadConstraint{},EphemeralContainers:[]EphemeralContainer{},SetHostnameAsFQDN:nil,OS:nil,HostUsers:nil,},Status:PodStatus{Phase:Running,Conditions:[]PodCondition{},Message:,Reason:,HostIP:,PodIP:172.17.13.15,StartTime:<nil>,ContainerStatuses:[]ContainerStatus{},QOSClass:,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{},EphemeralContainerStatuses:[]ContainerStatus{},},})" test=TestGetProfiles/Return_profile_with_endpoint_when_using_pod_DNS
```
2022-11-29 16:33:31 -05:00
Steve Jenson 791c6a77d7
Follows the HostPort mapping when a request for a pod comes in on node network (#9819)
Maps the request port to the container's port if the request comes in from the node network and has a hostPort mapping.

Problem:

When a request for a container comes in from the node network, the requested host port is used as-is, ignoring the hostPort mapping.

Solution:

When a request is seen coming from the node network, get the containerPort from the pod Spec.
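
A hedged sketch of that lookup (illustrative signature, not the exact code):

``` go
package sketch

import corev1 "k8s.io/api/core/v1"

// containerPortForHostPort finds the containerPort whose hostPort matches the
// port requested on the node network.
func containerPortForHostPort(pod *corev1.Pod, hostPort uint32) (uint32, bool) {
	for _, c := range pod.Spec.Containers {
		for _, p := range c.Ports {
			if p.HostPort != 0 && uint32(p.HostPort) == hostPort {
				return uint32(p.ContainerPort), true
			}
		}
	}
	return 0, false
}
```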

Validation:

Fixed an existing unit test and wrote a new one driving GetProfile specifically.

Fixes #9677 

Signed-off-by: Steve Jenson <stevej@buoyant.io>
2022-11-23 11:58:06 -05:00
AdamKorcz 5610d6b6fa
Fuzzing: Move fuzzers upstream (#7419)
Move fuzzers from downstream into Linkerd

Signed-off-by: AdamKorcz <adam@adalogics.com>
Co-authored-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-05-05 13:01:00 -06:00