37 Commits

Author SHA1 Message Date
Zahari Dichev f57137b121
fix(dest): fallback to default proxy inbound port when one could not be discovered on an ExternalWorkload (#13840)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2025-03-21 15:25:15 +02:00
Scott Fleener 5d0275e3f3
feat(dest): Default meshed traffic to inbound proxy port (#13715)
A follow up to https://github.com/linkerd/linkerd2/pull/13699, this default-enables the config option introduced in that PR. Now, all traffic between meshed pods should flow to the proxy's inbound port.

Signed-off-by: Scott Fleener <scott@buoyant.io>
2025-03-11 15:25:40 -07:00
Scott Fleener 05d48f6a52
fix(destination): Do not send admin traffic over opaque transport (#13758)
Traffic that is meant for the destination workload can be sent over the opaque transport without issue. However, traffic intended for the proxy itself (metrics scraping, tap) needs to be sent directly to the corresponding proxy port to prevent it from being forwarded to the workload.

This adds special cases for the admin and control ports, read directly from the environment variables on the pods, that exclude them from being sent over opaque transport.
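
A minimal sketch of the special-casing described above, assuming the admin and control ports have already been read from the pod's proxy environment variables. The `proxyPorts` type, the `useOpaqueTransport` function, and the port numbers are hypothetical names and values for illustration, not the controller's actual code.

```go
package main

import "fmt"

// proxyPorts holds the proxy's own listener ports (hypothetical type).
type proxyPorts struct {
	admin   uint32
	control uint32
}

// useOpaqueTransport reports whether traffic to targetPort may be sent over
// the opaque transport. Traffic addressed to the proxy's admin or control
// port must reach that port directly, so it is excluded.
func useOpaqueTransport(targetPort uint32, p proxyPorts) bool {
	return targetPort != p.admin && targetPort != p.control
}

func main() {
	// Port values here are illustrative assumptions.
	p := proxyPorts{admin: 4191, control: 4190}
	fmt.Println(useOpaqueTransport(8080, p)) // application port
	fmt.Println(useOpaqueTransport(4191, p)) // admin port
}
```

Keeping the decision in one predicate makes it easy to unit test each excluded port separately.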

Signed-off-by: Scott Fleener <scott@buoyant.io>
2025-03-11 12:15:10 -07:00
Scott Fleener 156bf60ad7
feat(destination): introduce transport-protocol outbound TLS mode (#13699)
Non-opaque meshed traffic currently flows over the original destination port, which requires the inbound proxy to do protocol detection.

This adds an option to the destination controller that configures all meshed traffic to flow to the inbound proxy's inbound port. This will allow us to include more session protocol information in the future, obviating the need for inbound protocol detection.

This doesn't do much in the way of testing, since the default behavior should be unchanged. When this default changes, more validation will be done on the behavior here.

Signed-off-by: Scott Fleener <scott@buoyant.io>
2025-03-05 13:51:21 -08:00
Scott Fleener 958cfca666
Export zone locality in outbound destination metrics (#13129)
Currently, we don't have a simple way of checking if the endpoint a proxy is discovering is in the same zone or not.

This adds a "zone_locality" metric label to the outbound destination address metrics. Note that this does not increase the cardinality of the related metrics, as this label doesn't vary within an endpoint.

Validated by checking the prometheus metrics on a local cluster and verifying this label appears in the outbound transport metrics.

Signed-off-by: Scott Fleener <scott@buoyant.io>
2024-10-15 13:43:05 -07:00
Alejandro Pedraza 137eac9df3
Add IPv6 support for the destination controller (#12428)
Services in dual-stack mode result in the creation of two EndpointSlices, one for each IP family. Before this change, the Get Destination API would nondeterministically return the address for any of those ES, depending on which one was processed last by the controller because they would overwrite each other.

As part of the ongoing effort to support IPv6/dual-stack networks, this change fixes that behavior by giving preference to IPv6 addresses whenever a service exposes both families.

There is a new set of unit tests in server_ipv6_test.go, and in the TestEndpointTranslatorForPods tests there are a couple of new cases to test the interaction with zone filtering.
Also the server unit tests were updated to segregate the tests and resources dealing with the IPv4/IPv6/dual-stack cases.
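
The IPv6 preference can be sketched as follows. This is a simplified illustration that works on a flat list of address strings; the `preferIPv6` helper is a hypothetical name, and the real controller operates on EndpointSlices rather than plain strings.

```go
package main

import (
	"fmt"
	"net"
)

// preferIPv6 returns only the IPv6 addresses whenever any are present;
// otherwise it returns the input unchanged.
func preferIPv6(addrs []string) []string {
	var v6 []string
	for _, a := range addrs {
		ip := net.ParseIP(a)
		// To4 returns nil for addresses that are not IPv4.
		if ip != nil && ip.To4() == nil {
			v6 = append(v6, a)
		}
	}
	if len(v6) > 0 {
		return v6
	}
	return addrs
}

func main() {
	fmt.Println(preferIPv6([]string{"10.0.0.1", "fd00::1"})) // dual-stack
	fmt.Println(preferIPv6([]string{"10.0.0.1"}))            // IPv4 only
}
```

Because the choice depends only on the address families present, the result no longer depends on which slice was processed last.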
2024-05-02 14:39:05 -05:00
Alejandro Pedraza 7cbe2f5ca6
Enable forwarding IPv6 connections through the proxy (#12495)
As part of the ongoing effort to support IPv6/dual-stack networks, this change
enables the proxy to properly forward IPv6 connections:

- Adds the new `LINKERD2_PROXY_OUTBOUND_LISTEN_ADDRS` environment variable when
  injecting the proxy. This is supported as of proxy v2.228.0 which was just
  pulled into the linkerd2 repo in #2d5085b56e465ef56ed4a178dfd766a3e16a631d.
  This adds the IPv6 loopback address (`[::1]`) to the IPv4 one (`127.0.0.1`)
  so the proxy can forward outbound connections received via IPv6. The injector
  will still inject `LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR` to support the rare
  case where the `proxy.image.version` value is overridden with an older
  version. The new proxy still considers that variable, but it's superseded by
  the new one. The old variable is considered deprecated and should be removed
  in the future.
- The values for `LINKERD2_PROXY_CONTROL_LISTEN_ADDR`,
  `LINKERD2_PROXY_ADMIN_LISTEN_ADDR` and `LINKERD2_PROXY_INBOUND_LISTEN_ADDR`
  have been updated to point to the IPv6 wildcard address (`[::]`) instead of
  the IPv4 one (`0.0.0.0`) for the same reason. Unlike with the loopback
  address, the IPv6 wildcard address suffices to capture both IPv4 and IPv6
  traffic.
- The endpoint translator's `getInboundPort()` has been updated to properly
  parse the IPv6 loopback address retrieved from the proxy container manifest.
  A unit test was added to validate the behavior.
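
As an illustration of family-agnostic port parsing, the sketch below extracts the port from a listen address with the standard library's `net.SplitHostPort`, which accepts both IPv4 and bracketed IPv6 hosts. The `portFromListenAddr` name and the example port numbers are assumptions; this is not the body of `getInboundPort()`.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// portFromListenAddr extracts the port from a listen address such as
// "0.0.0.0:4143" or "[::]:4143".
func portFromListenAddr(addr string) (uint32, error) {
	_, portStr, err := net.SplitHostPort(addr)
	if err != nil {
		return 0, fmt.Errorf("invalid listen address %q: %w", addr, err)
	}
	// A bit size of 16 rejects values outside the valid port range.
	port, err := strconv.ParseUint(portStr, 10, 16)
	if err != nil {
		return 0, fmt.Errorf("invalid port in %q: %w", addr, err)
	}
	return uint32(port), nil
}

func main() {
	// Example addresses; the port numbers are illustrative.
	for _, addr := range []string{"0.0.0.0:4143", "[::]:4143", "[::1]:4140"} {
		port, err := portFromListenAddr(addr)
		fmt.Println(addr, port, err)
	}
}
```

Splitting on the last colon by hand would break on IPv6 literals, which is the kind of bug a family-agnostic parser avoids.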
2024-05-02 16:39:19 +01:00
Alex Leong 2c2a96bc73
Removes should not change local traffic policy (#12325)
Fixes: #12311

When the endpoint translator receives a `remove` call, it was updating its local traffic policy based on the address set passed to remove. However, since `remove` is only meant to remove addresses and not change the address metadata, the endpoints watcher was not setting local traffic policy on these calls to `remove`. This can result in calls to `remove` temporarily turning off local traffic policy, which will cause non-local addresses to be sent to clients.

Since `remove` should not change address metadata, we now disregard any metadata in the call to `remove`, including any changes to the local traffic policy.

Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
2024-03-22 09:42:13 -07:00
Alejandro Pedraza b697e285a0
Refactor IPv4-only functions to also work for IPv6 (#12303)
The main change here is the refactoring of the address functions in `addr.go` that support the Destination controller and Viz's Tap controller. Some of those functions only worked for IPv4, so this change refactored them to make them IP family agnostic.
This enabled adding (and fixing) IPv6 unit tests as detailed in the following sections.

Other changes:

- The `ProxyAddressesToString()` function was no longer used, so it got removed.
- The `ProxyIPToString()` function was only used by the destination-client script, so that got stripped out.

## `addr_test.go`

We added IPv6 cases to each test, that would have failed previously.

## `endpoint_translator_test.go`

One of the test pods (pod3) was changed to have an IPv6 address. Without the other changes in this PR those tests would still have passed, but only because, when comparing actual IPs with expected ones, we weren't checking whether they were both zero. So here we added checks against that.

## `server_test.go`

As above, we added checks against empty IPs. And in the mocked resources in `test_util.go` we added an IPv6 EndpointSlice.
2024-03-22 07:20:52 -05:00
Matei David 98e38a66b6
Rename meshTls to meshTLS in ExternalWorkload CRD (#12098)
The ExternalWorkload resource we introduced has a minor naming
inconsistency; `Tls` in `meshTls` is not capitalised. Other resources
that we have (e.g. authentication resources) capitalise TLS (and so does
Go, it follows a similar naming convention).

We fix this in the workload resource by changing the field's name and
bumping the version to `v1beta1`.

Upgrading the control plane version will continue to work without
downtime. However, if an existing resource exists, the policy controller
will not completely initialise. It will not enter a crashloop backoff,
but it will also not become ready until the resource is edited or
deleted.

Signed-off-by: Matei David <matei@buoyant.io>
2024-02-20 11:00:13 -08:00
Oliver Gould 2ab76b64c6
destination: Rename zone weighting flag to ext-endpoint-zone-weights (#12090)
2024-02-16 09:06:56 -05:00
Zahari Dichev 027d49a9a6
discovery: handle endpoint slices from ExternalWorkload (#11939)
This alters the endpoints slices watcher to handle slices that reference ExternalWorkloads.

Testing
Add the following resources: 

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
addressType: IPv4
metadata:
  name: my-external-workload
  namespace: mixed-env
  labels:
    kubernetes.io/service-name: test-1
endpoints:
- addresses:
  - 172.21.0.5
  conditions:
    ready: true
    serving: true
    terminating: false
  targetRef:
    kind: ExternalWorkload
    name: my-external-workload
ports:
- port: 8080
  name: http
---
apiVersion: workload.linkerd.io/v1alpha1
kind: ExternalWorkload
metadata:
  name: my-external-workload
  namespace: mixed-env
  labels:
    app: test
spec:
  meshTls:
    identity: "test"
    serverName: "test"
  workloadIPs:
  - ip: 172.21.0.5
  ports:
  - port: 8080
    name: http
---
apiVersion: v1
kind: Service
metadata:
  name: test-1
  namespace: mixed-env
spec:
  selector:
    app: test
  type: ClusterIP
  ports:
  - name: http
    port: 8080
    targetPort: 8080
    protocol: TCP

```

Observe endpoints:
```
linkerd dg endpoints test-1.mixed-env.svc.cluster.local:8080
```

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2024-01-17 15:43:20 -08:00
Oliver Gould 5e558ae3e5
destination: Add optional experimental endpoint weighting (#11795)
This change adds a runtime flag to the destination controller,
--experimental-endpoint-zone-weights=true, that causes endpoints in the
local zone to receive higher weights. This feature is disabled by
default, since the weight value is not honored by proxies. No helm
configuration is exposed yet, either.

This weighting is instrumented in the endpoint translator. Tests are
added to confirm that the behavior is feature-gated.

Additionally, this PR adds the "zone" metric label to endpoint metadata
responses.
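
A rough sketch of feature-gated zone weighting. The base weight and multiplier below are assumed values chosen for illustration, and `endpointWeight` is a hypothetical helper rather than the endpoint translator's real code.

```go
package main

import "fmt"

const (
	baseWeight     uint32 = 10000 // assumed default endpoint weight
	zoneMultiplier uint32 = 10    // assumed boost for same-zone endpoints
)

// endpointWeight returns a higher weight for endpoints in the local zone,
// but only when the feature flag is enabled and the local zone is known.
func endpointWeight(enabled bool, localZone, endpointZone string) uint32 {
	if enabled && localZone != "" && endpointZone == localZone {
		return baseWeight * zoneMultiplier
	}
	return baseWeight
}

func main() {
	fmt.Println(endpointWeight(true, "west-1a", "west-1a"))  // boosted
	fmt.Println(endpointWeight(true, "west-1a", "west-1b"))  // base
	fmt.Println(endpointWeight(false, "west-1a", "west-1a")) // flag off: base
}
```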
2023-12-20 13:11:30 -08:00
Alex Leong 357a1d32b2
Add update queue to endpoint translator (#11491)
When a grpc client of the destination.Get API initiates a request but then doesn't read off of that stream, the HTTP2 stream flow control window will fill up and eventually exert backpressure on the destination controller. This manifests as calls to `Send` on the stream blocking. Since `Send` is called synchronously from the client-go informer callback (by way of the endpoint translator), this blocks the informer callback and prevents all further informer callbacks from firing. This causes the destination controller to stop sending updates to any of its clients.

We add a queue in the endpoint translator so that when it gets an update from the informer callback, that update is queued and we avoid potentially blocking the informer callback.  Each endpoint translator spawns a goroutine to process this queue and call `Send`.  If there is not capacity in this queue (e.g. because a client has stopped reading and we are experiencing backpressure) then we terminate the stream.
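
The queueing pattern can be sketched with a buffered channel and a non-blocking send. All names below (`translator`, `enqueue`, `run`, `errQueueFull`) are hypothetical, and the real translator sends gRPC messages rather than calling a plain function.

```go
package main

import (
	"errors"
	"fmt"
)

var errQueueFull = errors.New("update queue full")

// update stands in for an endpoint update (hypothetical type).
type update struct{ addrs []string }

type translator struct {
	updates chan update
	send    func(update) error // stands in for the stream's Send
}

func newTranslator(capacity int, send func(update) error) *translator {
	return &translator{updates: make(chan update, capacity), send: send}
}

// enqueue never blocks the caller (the informer callback). If the queue is
// full, it reports an error so the caller can terminate the stream.
func (t *translator) enqueue(u update) error {
	select {
	case t.updates <- u:
		return nil
	default:
		return errQueueFull
	}
}

// run drains the queue and is meant to be started in its own goroutine.
// Only this goroutine can block on a slow client.
func (t *translator) run(done <-chan struct{}) {
	for {
		select {
		case u := <-t.updates:
			if err := t.send(u); err != nil {
				return
			}
		case <-done:
			return
		}
	}
}

func main() {
	// Simulate a stalled client: run is never started, so nothing drains
	// the queue and the third update overflows.
	stalled := newTranslator(2, func(update) error { return nil })
	for i := 0; i < 3; i++ {
		fmt.Println(stalled.enqueue(update{}))
	}
}
```

The `select` with a `default` branch is what keeps the informer callback from blocking: a full queue becomes an immediate error instead of a stalled goroutine.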

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-10-18 12:34:38 -07:00
Eliza Weisman ed4d240e36
destination: send `Opaque` protocol hint for opaque ports (#10301)
The outbound proxy handles endpoints with the `opaque_transport` flag by
opening a direct connection to the inbound proxy's inbound listener
port, and sending a ProtoBuf `TransportHeader` including the target port
of the originating outbound connection and an (optional)
`SessionProtocol` describing the protocol used on that connection.

Currently, outbound proxies initiating direct connections will *always*
send `SessionProtocol` values communicating the protocol as understood
by the outbound proxy. However, this is not always the desired behavior.
Direct connections with `TransportHeader`s are used in two cases: for
gateway connections, and for ports which are marked as opaque. When the
inbound port is marked as opaque, the presence of a `SessionProtocol`
tells the inbound proxy to handle that connection as the indicated
protocol, which results in incorrect behavior when the inbound proxy's
ServerPolicy configures the target port as opaque (see #9888).

Therefore, the `Destination` proxy API has been updated to add a new
`ProtocolHint`, `Opaque`, which indicates that an outbound proxy should
_not_ send a `SessionProtocol` when initiating a direct connection, even
if the outbound proxy handled the connection as HTTP. This hint was
added to the proxy API in linkerd/linkerd2-proxy-api#197, and released
in `linkerd2-proxy-api` v0.8.0.

This branch updates the Destination controller's dependency on
`linkerd2-proxy-api` to v0.8.0, and changes the controller to send an
`Opaque` protocol hint when the target port is marked as opaque on the
destination pod. This should override the `H2` protocol hint that is
added when the destination is meshed. I've also added a new test for
this behavior.

Fixes #9888 (along with linkerd/linkerd2-proxy#2209, which changes the
proxy to actually handle the `Opaque` protocol hint).
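
A simplified sketch of the hint selection described above. It assumes that hints are only set for meshed destinations and that opaque ports are available as a set; the `protocolHint` type and `selectHint` function are hypothetical and do not mirror the `linkerd2-proxy-api` types.

```go
package main

import "fmt"

type protocolHint string

const (
	hintNone   protocolHint = "none"
	hintH2     protocolHint = "H2"
	hintOpaque protocolHint = "Opaque"
)

// selectHint picks the protocol hint for an endpoint. For a meshed
// destination, an opaque target port overrides the default H2 hint.
func selectHint(meshed bool, port uint32, opaquePorts map[uint32]struct{}) protocolHint {
	if !meshed {
		return hintNone
	}
	if _, ok := opaquePorts[port]; ok {
		return hintOpaque
	}
	return hintH2
}

func main() {
	opaque := map[uint32]struct{}{3306: {}} // illustrative opaque port
	fmt.Println(selectHint(true, 3306, opaque))
	fmt.Println(selectHint(true, 8080, opaque))
	fmt.Println(selectHint(false, 8080, opaque))
}
```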
2023-04-14 16:48:03 -07:00
Yu Cao e662e147ca
Support service internal traffic policy (#10186)
Closes #10130

https://kubernetes.io/docs/concepts/services-networking/service-traffic-policy/
1. Update endpoints watcher to include additional field `localTrafficPolicy`.
   Set to true when `.spec.internalTrafficPolicy` is set to `Local`
2. Update endpoints translator to filter by node when `localTrafficPolicy` is
   set to true. Topology Aware Hints are not used when
   `service.spec.internalTrafficPolicy` is set to `Local`.
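
The node filtering in step 2 can be sketched as below. The `address` type and `filterLocal` function are hypothetical simplifications of the translator's data structures.

```go
package main

import "fmt"

// address is a simplified endpoint address (hypothetical type).
type address struct {
	ip       string
	nodeName string
}

// filterLocal keeps only addresses on the client's node when the service
// uses a local internal traffic policy; otherwise it returns all addresses.
func filterLocal(addrs []address, localTrafficPolicy bool, nodeName string) []address {
	if !localTrafficPolicy {
		return addrs
	}
	var local []address
	for _, a := range addrs {
		if a.nodeName == nodeName {
			local = append(local, a)
		}
	}
	return local
}

func main() {
	addrs := []address{
		{ip: "10.0.0.1", nodeName: "node-a"},
		{ip: "10.0.0.2", nodeName: "node-b"},
	}
	fmt.Println(filterLocal(addrs, true, "node-a"))
	fmt.Println(filterLocal(addrs, false, "node-a"))
}
```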

Signed-off-by: Yu Cao <yc185050@ncr.com>
2023-02-06 13:53:07 -07:00
Alejandro Pedraza ed5dd35b57
Guard `endpointTranslator` with mutex (#9901)
Fixes #9896

The maps in `endpointTranslator` weren't being guarded against
concurrent access, so we're adding locks at the `Add` and `Remove`
methods. Also these functions ultimately call the `SendMsg` method on
the gRPC `stream`, which is not
["thread-safe"](https://github.com/grpc/grpc-go/blob/master/stream.go#L122-L126),
so we're guarding against other problems as well.
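
A minimal sketch of the locking pattern, using a hypothetical `translator` type with a single map. It is not the real `endpointTranslator`, which also sends on a gRPC stream while holding the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// translator guards its map with a mutex so that Add and Remove can be
// called from multiple goroutines.
type translator struct {
	mu        sync.Mutex
	available map[string]struct{}
}

func (t *translator) Add(addrs ...string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for _, a := range addrs {
		t.available[a] = struct{}{}
	}
}

func (t *translator) Remove(addrs ...string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	for _, a := range addrs {
		delete(t.available, a)
	}
}

func (t *translator) Len() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return len(t.available)
}

func main() {
	t := &translator{available: make(map[string]struct{})}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			t.Add(fmt.Sprintf("10.0.0.%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println(t.Len())
}
```

Running this with `go run -race` is a quick way to confirm that the concurrent calls no longer trip the race detector.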

A new unit test `TestConcurrency` was added that failed in the following
ways before this fix:

When running the test with the `-race` flag, we immediately get the data race warning:

```bash
$ go test ./controller/api/destination/... -run TestConcurrency -race
time="2022-11-25T16:48:52-05:00" level=info msg="waiting for caches to sync"
time="2022-11-25T16:48:52-05:00" level=info msg="caches synced"
==================
WARNING: DATA RACE
Read at 0x00c0000c0040 by goroutine 161:
  github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add()
      /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:80 +0x29c
  github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
      /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x92

Previous write at 0x00c0000c0040 by goroutine 162:
  github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).sendFilteredUpdate()
      /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:95 +0x66
  github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add()
      /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:83 +0x330
  github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
      /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x92

Goroutine 161 (running) created at:
  github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency()
      /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1439 +0x213
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1486 +0x47

Goroutine 162 (running) created at:
  github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency()
      /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1439 +0x213
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1486 +0x47
```

If run without the `-race` flag, we get the `concurrent map writes` panic reported in #9896:

```bash
$ go test ./controller/api/destination/... -run TestConcurrency -count=1
time="2022-11-25T16:53:25-05:00" level=info msg="waiting for caches to sync"
time="2022-11-25T16:53:25-05:00" level=info msg="caches synced"
fatal error: concurrent map writes

goroutine 187 [running]:
runtime.throw({0x1b57bc4?, 0x500000000000000?})
        /usr/local/go/src/runtime/panic.go:992 +0x71 fp=0xc00013dc80 sp=0xc00013dc50 pc=0x43a5b1
runtime.mapassign(0xc00013dec8?, 0x2?, 0x0?)
        /usr/local/go/src/runtime/map.go:595 +0x4d6 fp=0xc00013dd00 sp=0xc00013dc80 pc=0x4113b6
github.com/linkerd/linkerd2/controller/api/destination.(*endpointTranslator).Add(...)
        /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator.go:80
github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency.func1()
        /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:338 +0x1a8 fp=0xc00013dfe0 sp=0xc00013dd00 pc=0x16d1da8
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00013dfe8 sp=0xc00013dfe0 pc=0x46d721
created by github.com/linkerd/linkerd2/controller/api/destination.TestConcurrency
        /home/alpeb/pr/destination-panic/linkerd2/controller/api/destination/endpoint_translator_test.go:336 +0x3c
```
2022-11-28 15:07:18 -05:00
Alex Leong b7a0b8adb4
Bump minimum kubernetes version to 1.21 (#8647)
Fixes #8592

Increase the minimum supported kubernetes version from 1.20 to 1.21. This allows us to drop support for batch/v1beta1/CronJob and discovery/v1beta1/EndpointSlices, instead using only v1 of those resources. This fixes deprecation warnings about these resources printed by the CLI.

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-06-14 15:15:28 -07:00
AdamKorcz 5610d6b6fa
Fuzzing: Move fuzzers upstream (#7419)
Move fuzzers from downstream into Linkerd

Signed-off-by: AdamKorcz <adam@adalogics.com>
Co-authored-by: Kevin Leimkuhler <kleimkuhler@icloud.com>
2022-05-05 13:01:00 -06:00
Oliver Gould fa8ddb4801
Use go-test/deep for comparisons in tests (#8427)
We frequently compare data structures--sometimes very large data
structures--that are difficult to compare visually. This change replaces
uses of `reflect.DeepEqual` with `deep.Equal`. `go-test`'s `deep.Equal`
returns a diff of values that are not equal.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-05-05 09:31:07 -07:00
Alex Leong 8692badc69
destination: Fix bug in filtering logic (#8169)
The destination controller can improperly handle updates by returning a
map reference instead of a new data structure. This breaks diffing logic,
as newly added endpoints appear to pre-exist.

This change ensures that a fresh data structure is used when handling
discovery updates.
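
The aliasing problem can be reproduced in a few lines. This sketch uses hypothetical `addressSet`, `snapshot`, and `diffAdded` names to show why diffing against a shared map reference hides newly added endpoints, while diffing against a fresh copy does not.

```go
package main

import "fmt"

type addressSet map[string]struct{}

// snapshot returns a fresh copy so that later mutations of s are not
// visible through the returned value.
func snapshot(s addressSet) addressSet {
	out := make(addressSet, len(s))
	for k := range s {
		out[k] = struct{}{}
	}
	return out
}

// diffAdded returns the addresses present in next but not in prev.
func diffAdded(prev, next addressSet) []string {
	var added []string
	for k := range next {
		if _, ok := prev[k]; !ok {
			added = append(added, k)
		}
	}
	return added
}

func main() {
	live := addressSet{"10.0.0.1": {}}

	aliased := live          // shares storage with live
	copied := snapshot(live) // independent copy

	live["10.0.0.2"] = struct{}{}

	// The aliased map already contains the new address, so nothing
	// appears to have been added; the copy reveals the addition.
	fmt.Println(len(diffAdded(aliased, live)))
	fmt.Println(len(diffAdded(copied, live)))
}
```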

Fixes #8143

Signed-off-by: Alex Leong <alex@buoyant.io>
2022-03-29 17:11:32 -07:00
Alejandro Pedraza 68b63269d9
Remove the `proxy.disableIdentity` config (#7729)
* Remove the `proxy.disableIdentity` config

Fixes #7724

Also:
- Removed the `linkerd.io/identity-mode` annotation.
- Removed the `config.linkerd.io/disable-identity` annotation.
- Removed the `linkerd.proxy.validation` template partial, which only
  made sense when `proxy.disableIdentity` was `true`.
- TestInjectManualParams now requires to hit the cluster to retrieve the
  trust root.
2022-01-31 10:17:10 -05:00
Matei David 690bc09c35
Stop using deprecated `beta.kubernetes.io/node` label (#7310)
In our chart values and (some) integration tests, we're using a deprecated
label for node selection. According to the warning messages we get during
installation, the label has been deprecated since k8s `v1.14`:

```
Warning: spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead
Warning: spec.jobTemplate.spec.template.spec.nodeSelector[beta.kubernetes.io/os]: deprecated since v1.14; use "kubernetes.io/os" instead
```

This PR replaces all occurrences of `beta.kubernetes.io/node` with
`kubernetes.io/node`.

Fixes #7225
2021-11-19 09:50:15 -08:00
Kevin Leimkuhler ebb1ee8c4c
Deprecate `topologyKeys` and add support for endpoint slices `Hints`. (#6698)
## background

In order to upgrade `client-go` and other related libraries to `v0.22.0`, we had to address the deprecation of the service's `TopologyKeys` field. This field and its related feature have been deprecated and superseded by [Topology Aware Hints](https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/2433-topology-aware-hints/README.md).

The goal of topology aware hints is to provide a simpler way for users to prefer endpoints by basing decisions solely off the node's `topology.kubernetes.io/zone` label. If a node is in `zone-a`, then it should prefer endpoints that _should_ be consumed by clients in `zone-a`.

kube-proxy (and now the destination controller) know that an endpoint _should_ be consumed by clients in certain zones if its `Hints.ForZones` field is set with a zone value that matches that of the client.

For example, the endpoint slice controller may add the following hint to an endpoint:
```
- addresses: ["1.1.1.1"]
  zone: "zone-a"
  hints:
    forZones:
    - name: "zone-b"
```

The above endpoint is an endpoint that is located in `zone-a` but should be consumed by clients in `zone-b`.

## changes

Now that topological preference is not a concept, we can remove it from the `servicePublisher` and `portPublisher` structs. The fields were only there so that they could be propagated down to individual addresses.

The `Hints` field is only present on endpoints that belong to an `EndpointSlice`, so use of this field is limited to the `endpointSliceToAddresses` function.

When endpoint slices are translated to an `AddressSet` now, for each address (endpoint) we make sure to copy the `Hints.ForZones` field if it is present. This field is only present if it's set by the endpoint slice controller and it has [several safeguards](https://kubernetes.io/docs/concepts/services-networking/topology-aware-hints/#safeguards).

After `endpointSliceToAddresses` has translated an endpoint slice into an `AddressSet` and updated the endpoint translator's `availableEndpoints`, filtering takes place and is the crux of this change.

For each potential address that we have to consider in `availableEndpoints`, we make sure to only return a set of addresses whose consumption zones (zones in the `forZones` field) match that of the node's zone. That way, we only communicate with endpoints that have been labeled by the endpoint slice controller for the current node we're on.

This allows us to remove the ordering/hierarchy of topological region and considering the `*` value.
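
The zone filtering step can be sketched as follows. The `address` type and `filterByZone` function are hypothetical; the sketch only models the matching rule described above and does not model what happens when some endpoints carry no hints.

```go
package main

import "fmt"

// address is a simplified endpoint with its consumption zones
// (hypothetical type; forZones mirrors the Hints.ForZones field).
type address struct {
	ip       string
	forZones []string
}

// filterByZone keeps the addresses whose forZones include the node's zone.
func filterByZone(addrs []address, nodeZone string) []address {
	var out []address
	for _, a := range addrs {
		for _, z := range a.forZones {
			if z == nodeZone {
				out = append(out, a)
				break
			}
		}
	}
	return out
}

func main() {
	addrs := []address{
		{ip: "1.1.1.1", forZones: []string{"west-1a"}},
		{ip: "1.1.1.2", forZones: []string{"west-1b"}},
	}
	fmt.Println(filterByZone(addrs, "west-1a"))
}
```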

## testing

I've added a unit test which creates an endpoint translator tied to a node in `west-1a` and asserts that it only handles updates for addresses that should be consumed by clients in `west-1a`.

Closes #6637

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-11-08 12:21:31 -07:00
Tarun Pothulapati 5c1a375a51
destination: pass opaque-ports through cmd flag (#5829)
* destination: pass opaque-ports through cmd flag

Fixes #5817

Currently, default opaque ports are stored in two places: `Values.yaml`
and `opaqueports/defaults.go`. As these ports are used only in the
destination component, we can instead pass these values to it as a cmd
flag from `Values.yaml` and remove defaultPorts from `defaults.go`.

This means that if users override the `opaquePorts` field in `Values.yaml`,
that change is propagated to both injection and discovery, as expected.

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-03-01 16:00:20 +05:30
Filip Petkovski 73f9fb3518
Use a shared informer when getting node topology (#5722)
Getting information about node topology queries the k8s api directly.
In an environment with high traffic and high number of pods, the
k8s api server can become overwhelmed or start throttling requests.

This MR introduces a node informer to resolve the bottleneck and
fetch node information asynchronously.

Fixes #5684

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
2021-02-12 11:05:38 -05:00
Tarun Pothulapati d0caaa86c4
Bump k8s client-go to v0.19.2 (#5002)
Fixes #4191 #4993

This bumps Kubernetes client-go to the latest v0.19.2 (We had to switch directly to 1.19 because of this issue). Bumping to v0.19.2 required upgrading to smi-sdk-go v0.4.1. This also depends on linkerd/stern#5

This consists of the following changes:

- Fix ./bin/update-codegen.sh by adding the template path to the gen commands, as it is needed after we moved to GOMOD.
- Bump all k8s related dependencies to v0.19.2
- Generate CRD types, client code using the latest k8s.io/code-generator
- Use context.Context as the first argument, in all code paths that touch the k8s client-go interface

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2020-09-28 12:45:18 -05:00
Matei David f797ab1e65
service topologies: topology-aware service routing (#4780)
[Link to RFC](https://github.com/linkerd/rfc/pull/23)

### What
---
* PR that puts together all past pieces of the puzzle to deliver topology-aware service routing, as specified in the [Kubernetes docs](https://kubernetes.io/docs/concepts/services-networking/service-topology/) but with a much better load balancing algorithm and all the coolness of linkerd :) 
* The first piece of this PR is focused on adding topology metadata: topology preference for services and topology `<k,v>` pairs for endpoints.
* The second piece of this PR puts together the new context format and fetching the source node topology metadata in order to allow for endpoints filtering.
* The final part is doing the filtering -- passing all of the metadata to the listener and on every `Add` filtering endpoints based on the topology preference of the service, topology `<k,v>` pairs of endpoints and topology of the source (again `<k,v>` pairs).

### How
---

* **Collecting metadata**:
   -  Services do not have values for topology keys -- the topological keys defined in a service's spec are only there to dictate locality preference for routing; as such, I decided to store them in an array, they will be taken exactly as they are found in the service spec, this ensures we respect the preference order.

   - For EndpointSlices, we are using a map -- an EndpointSlice has locality information in the form of `<k,v>` pair, where the key is a topological key (similar to what's listed in the service) and the value is the locality information -- e.g `hostname: minikube`. For each address we now have a map of topology values which gets populated when we translate the endpoints to an address set. Because normal Endpoints do not have any topology information, we create each address with an empty map which is subsequently populated ONLY for slices in the `endpointSliceToAddressSet` function.

* **Filtering endpoints**:
  - This was a tricky part and filled me with doubts. I think there are a few ways to do this, but this is how I "envisioned" it. First, the `endpoint_translator.go` should be the one to do the filtering; this means that on subscription, we need to feed all of the relevant metadata to the listener. To do this, I created a new function `AddTopologyFilter` as part of the listener interface.

  - To complement the `AddTopologyFilter` function, I created a new `TopologyFilter` struct in `endpoints_watcher.go`. I then embedded this structure in all listeners that implement the interface. The structure holds the source topology (source node), a boolean to tell if slices are activated in case we need to double check (or write tests for the function) and the service preference. We create the filter on Subscription -- we have access to the k8s client here as well as the service, so it's the best point to collect all of this data together. Addresses all have their own topology added to them so they do not have to be collected by the filter.

  - When we add a new set of addresses, we check to see if slices are enabled -- chances are if slices are enabled, service topology might be too. This lets us skip this step if the latest version is not adopted. Prior to sending an `Add` we filter the endpoints -- if the preference is registered by the filter we strictly enforce it, otherwise nothing changes.

And that's pretty much it. 

Signed-off-by: Matei David <matei.david.35@gmail.com>
2020-08-18 11:11:09 -07:00
Josh Soref 72aadb540f
Spelling (#4872)
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).

The misspellings have been reported at aaf440489e (commitcomment-41423663)

The action reports that the changes in this PR would make it happy: 5b82c6c5ca

Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-08-12 21:59:50 -07:00
Oliver Gould c4d649e25d
Update proxy-api version to v0.1.13 (#4614)
This update includes no API changes, but updates grpc-go
to the latest release.
2020-06-24 12:52:59 -07:00
Zahari Dichev 10ecd8889e
Set auth override (#4160)
Set AuthOverride when present on endpoints annotation

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-25 10:56:36 +02:00
Zahari Dichev caf4e61daf
Enable identity on endpoints not associated with pods (#4134)
Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
2020-03-09 20:55:57 +02:00
Alejandro Pedraza afb93cddc8
Use `t.Name()` instead of `t.Name` in tests (#3970)
Use `t.Name()` instead of `t.Name` when retrieving the name of tests.
This was causing an error to be added in the log:
```
output: logrus_error="can not add field \"test\"
```

Followup to
[comment](https://github.com/linkerd/linkerd2/pull/3965#discussion_r370387990)
2020-01-27 09:17:19 -05:00
Alex Leong 03762cc526
Support pod ip and service cluster ip lookups in the destination service (#3595)
Fixes #3444 
Fixes #3443 

## Background and Behavior

This change adds support for the destination service to resolve Get requests which contain a service clusterIP or pod ip as the `Path` parameter.  It returns the stream of endpoints, just as if `Get` had been called with the service's authority.  This lays the groundwork for allowing the proxy to TLS TCP connections by allowing the proxy to do destination lookups for the SO_ORIG_DST of tcp connections.  When that ip address corresponds to a service cluster ip or pod ip, the destination service will return the endpoints stream, including the pod metadata required to establish identity.

Prior to this change, attempting to look up an ip address in the destination service would result in a `InvalidArgument` error.

Updating the `GetProfile` method to support ip address lookups is out of scope and attempts to look up an ip address with the `GetProfile` method will result in `InvalidArgument`.

## Implementation

We do this by creating an `IPWatcher` which wraps the `EndpointsWatcher` and supports lookups by ip. `IPWatcher` maintains a mapping of clusterIPs to service ids and translates subscriptions to an IP address into a subscription to the service id using the underlying `EndpointsWatcher`.

Since the service name is no longer always infer-able directly from the input parameters, we restructure `EndpointTranslator` and `PodSet` so that we propagate the service name from the endpoints API response.

## Testing

This can be tested by running the destination service locally, using the current kube context to connect to a Kubernetes cluster:

```
go run controller/cmd/main.go destination -kubeconfig ~/.kube/config
```

Then lookups can be issued using the destination client:

```
go run controller/script/destination-client/main.go -path 192.168.54.78:80 -method get -addr localhost:8086
```

Service cluster ips and pod ips can be used as the `path` argument.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-12-19 09:25:12 -08:00
Alex Leong e538a05ce2
Add support for stateful sets (#3113)
We add support for looking up individual pods in a stateful set with the destination service.  This allows Linkerd to correctly proxy requests which address individual pods.  The authority structure for such a request is `<pod-name>.<service>.<namespace>.svc.cluster.local:<port>`.
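
To make the authority structure concrete, here is a small parsing sketch. It assumes the default `svc.cluster.local` suffix, and the `podAuthority` type and `parsePodAuthority` function are hypothetical names, not the destination service's own parser.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// podAuthority holds the parts of a per-pod authority (hypothetical type).
type podAuthority struct {
	pod       string
	service   string
	namespace string
	port      string
}

// parsePodAuthority splits an authority of the form
// <pod-name>.<service>.<namespace>.svc.cluster.local:<port>.
func parsePodAuthority(authority string) (podAuthority, error) {
	host, port, err := net.SplitHostPort(authority)
	if err != nil {
		return podAuthority{}, err
	}
	const suffix = ".svc.cluster.local"
	if !strings.HasSuffix(host, suffix) {
		return podAuthority{}, fmt.Errorf("%q does not end in %q", host, suffix)
	}
	labels := strings.Split(strings.TrimSuffix(host, suffix), ".")
	if len(labels) != 3 {
		return podAuthority{}, fmt.Errorf("expected <pod-name>.<service>.<namespace>, got %q", host)
	}
	return podAuthority{pod: labels[0], service: labels[1], namespace: labels[2], port: port}, nil
}

func main() {
	a, err := parsePodAuthority("web-0.web.default.svc.cluster.local:8080")
	fmt.Println(a, err)
}
```

A plain service authority has only two labels before the suffix, so the length check is what distinguishes a per-pod lookup from a service lookup in this sketch.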

Fixes #2266 

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-07-24 14:09:46 -07:00
arminbuerkle 010efac24b
Allow custom cluster domain in controller components (#2950)
* Allow custom cluster domain in destination watcher

The change relaxes the constraints of an authority requiring a
`svc.cluster.local` suffix to only require `svc` as third part.

A unit test could be added though the destination/server and endpoint
watcher already test this behaviour.

* Update proto to allow setting custom cluster domain

Update golden templates

* Allow setting custom domain in grpc, web server

* Remove cluster domain flags from web srv and public api

* Set defaultClusterDomain in validateAndBuild if none is set

Signed-off-by: Armin Buerkle <armin.buerkle@alfatraining.de>
2019-07-23 08:59:41 -07:00
Alex Leong 06a69f69c5
Refactor destination service (#2786)
This is a major refactor of the destination service.  The goals of this refactor are to simplify the code for improved maintainability.  In particular:

* Remove the "resolver" interfaces.  These were a holdover from when our decision tree was more complex about how to handle different kinds of authorities.  The current implementation only accepts fully qualified kubernetes service names and thus this was an unnecessary level of indirection.
* Moved the endpoints and profile watchers into their own package for a more clear separation of concerns.  These watchers deal only in Kubernetes primitives and are agnostic to how they are used.  This allows a cleaner layering when we use them from our gRPC service.
* Renamed the "listener" types to "translator" to make it more clear that the function of these structs is to translate kubernetes updates from the watcher to gRPC messages.

Signed-off-by: Alex Leong <alex@buoyant.io>
2019-06-04 15:01:16 -07:00