Commit Graph

12 Commits

Author SHA1 Message Date
Alex Leong 1e605ddf63
Add informer lag histograms (#11534)
In order to detect whether the destination controller's k8s informers have fallen behind, we add a histogram for each resource type.  These histograms track the delta between when an update to a resource occurs and when the destination controller processes that update.  We do this by taking the most recent update timestamp from the resource's managed fields and comparing it to the current time (see the sketch after the list below).

The histogram metrics are of the form `{kind}_informer_lag_ms_bucket_*`.

* We record a value only for updates, not for adds or deletes.  This is because when the controller starts up, it will populate its cache with an add for each resource in the cluster and the delta between the last updated time of that resource and the current time may be large.  This does not represent informer lag and should not be counted as such.
* When the informer performs resyncs, we get updates where the updated time of the old version is equal to the updated time of the new version.  This does not represent an actual update of the resource itself and so we do not record a value.
* Since we are comparing timestamps set on the managed fields of resources to the current time from the destination controller's system clock, the accuracy of these metrics depends on clock drift being minimal across the cluster.
* We use histogram buckets which range from 500ms to about 17 minutes.  In my testing, an informer lag of 500ms-1000ms is typical.  However, we wish to have enough buckets to identify cases where the informer is lagged significantly behind.
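
As a rough illustration (not the actual linkerd2 code) of how the lag can be derived from managed-fields timestamps while skipping resyncs; `observe` stands in for a Prometheus histogram's `Observe` method:

```go
package informerlag

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// latestManagedFieldsUpdate returns the most recent managed-fields timestamp
// on the object, if any.
func latestManagedFieldsUpdate(obj metav1.Object) (time.Time, bool) {
	var latest time.Time
	for _, f := range obj.GetManagedFields() {
		if f.Time != nil && f.Time.Time.After(latest) {
			latest = f.Time.Time
		}
	}
	return latest, !latest.IsZero()
}

// recordLag observes the delta between the resource's most recent update and
// the local clock, e.g. into a histogram with buckets from ~500ms upwards.
func recordLag(oldObj, newObj metav1.Object, observe func(ms float64)) {
	oldT, _ := latestManagedFieldsUpdate(oldObj)
	newT, ok := latestManagedFieldsUpdate(newObj)
	if !ok || !newT.After(oldT) {
		// Resync: the "new" version carries no newer update time; skip it.
		return
	}
	observe(float64(time.Since(newT).Milliseconds()))
}
```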

Signed-off-by: Alex Leong <alex@buoyant.io>
2023-11-08 14:56:20 -05:00
Alejandro Pedraza 147c8dc07c
Add metrics to server and service watchers (#10213)
* Add metrics to server and service watchers

Closes #10202 and completes #2204

As a followup to #10201, I'm adding the following metric in `server_watcher.go`:

- `server_port_subscribers`: This tracks the number of subscribers to changes to Servers associated with a port in a pod. The metric's labels identify the namespace and name of the pod, and its targeted port.

Additionally, `opaque_ports.go` was missing metrics; I added `service_subscribers`, which tracks the number of subscribers to a given Service, labeled by the Service's namespace and name.

`opaque_ports.go` was also leaking keys in its subscribers map, so that got fixed as well.
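
A minimal sketch, assuming a Prometheus client_golang gauge vector; the exact label names used in `server_watcher.go` and `opaque_ports.go` may differ:

```go
package watchermetrics

import "github.com/prometheus/client_golang/prometheus"

// server_port_subscribers, labeled by the pod's namespace, name, and port.
var serverPortSubscribers = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "server_port_subscribers",
		Help: "Number of subscribers to Server changes for a pod's port.",
	},
	[]string{"namespace", "name", "port"},
)

func init() { prometheus.MustRegister(serverPortSubscribers) }

// Incremented when a listener subscribes, decremented when it unsubscribes.
func subscribed(ns, pod, port string)   { serverPortSubscribers.WithLabelValues(ns, pod, port).Inc() }
func unsubscribed(ns, pod, port string) { serverPortSubscribers.WithLabelValues(ns, pod, port).Dec() }
```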
2023-02-07 08:51:09 -05:00
dependabot[bot] 62d6d7cd52
build(deps): bump sigs.k8s.io/gateway-api from 0.5.1 to 0.6.0 (#10038)
* build(deps): bump sigs.k8s.io/gateway-api from 0.5.1 to 0.6.0

Bumps [sigs.k8s.io/gateway-api](https://github.com/kubernetes-sigs/gateway-api) from 0.5.1 to 0.6.0.
- [Release notes](https://github.com/kubernetes-sigs/gateway-api/releases)
- [Changelog](https://github.com/kubernetes-sigs/gateway-api/blob/main/CHANGELOG.md)
- [Commits](https://github.com/kubernetes-sigs/gateway-api/compare/v0.5.1...v0.6.0)

---
updated-dependencies:
- dependency-name: sigs.k8s.io/gateway-api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Account for possible errors returned from `AddEventHandler`

In v0.26.0, client-go's `AddEventHandler` method for informers started
returning a registration handle (which we ignore) and an error that we
now surface (see the sketch after this list).

* client-go v0.26.0 removed the openstack plugin

* Temporary changes to trigger tests in k8s 1.21

- Adds an innocuous change to integration.yml so that all tests get
  triggered
- Hard-code k8s version in `k3d cluster create` invocation to v1.21

* Revert "Temporary changes to trigger tests in k8s 1.21"

This reverts commit 3e1fdd0e5e.
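
A minimal sketch of adapting to the client-go >= v0.26.0 signature: the informer's `AddEventHandler` now returns a registration handle (ignored here) and an error that is passed back to the caller; the function name is hypothetical.

```go
package watcher

import (
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// addPodHandler registers an event handler on the pod informer and surfaces
// the error instead of discarding it.
func addPodHandler(cs kubernetes.Interface, handler cache.ResourceEventHandler) error {
	factory := informers.NewSharedInformerFactory(cs, 0)
	_, err := factory.Core().V1().Pods().Informer().AddEventHandler(handler)
	return err
}
```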

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2023-01-16 09:38:09 -05:00
Oliver Gould b2f22dee78
go: Copy port range utilities from the proxy-init repo (#9143)
The proxy-init repo is changing its structure and, as such, we want to
minimize cross-repo dependencies from linkerd2 to linkerd2-proxy-init.
(We expect the cni-plugin code to move in a followup change).

This change duplicates the port range parsing utility (about 50 lines,
plus tests). This avoids stray dependencies on linkerd2-proxy-init.
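
A rough sketch of the kind of port-range parsing utility being copied; the actual type and function names in the linkerd2 tree may differ:

```go
package ports

import (
	"fmt"
	"strconv"
	"strings"
)

// PortRange is an inclusive range of TCP ports.
type PortRange struct{ LowerBound, UpperBound int }

// ParsePortRange accepts either a single port ("443") or a range ("8080-8090").
func ParsePortRange(s string) (PortRange, error) {
	parts := strings.SplitN(strings.TrimSpace(s), "-", 2)
	lo, err := strconv.Atoi(parts[0])
	if err != nil || lo < 0 || lo > 65535 {
		return PortRange{}, fmt.Errorf("invalid port: %q", parts[0])
	}
	hi := lo
	if len(parts) == 2 {
		hi, err = strconv.Atoi(parts[1])
		if err != nil || hi < lo || hi > 65535 {
			return PortRange{}, fmt.Errorf("invalid port range: %q", s)
		}
	}
	return PortRange{LowerBound: lo, UpperBound: hi}, nil
}
```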

Signed-off-by: Oliver Gould <ver@buoyant.io>
2022-08-12 10:34:02 -07:00
Jacob Henner 7d47639608
Remove kube-system exclusions from watchers (#8720)
Watch events for objects in the kube-system namespace were previously ignored.
In certain situations, this would cause the destination service to return
invalid (outdated) endpoints for services in kube-system - including unmeshed
services.

It [was suggested][1] that kube-system events were ignored to avoid handling
frequent Endpoint updates - specifically from [controllers using Endpoints for
leader elections][2]. As of Kubernetes 1.20, these controllers [default to using
Leases instead of Endpoints for their leader elections][3], obviating the need
to exclude (or filter) updates from kube-system. The exclusions have been
removed accordingly.

[1]: https://github.com/linkerd/linkerd2/pull/4133#issuecomment-594983588
[2]: https://github.com/kubernetes/kubernetes/issues/86286
[3]: https://github.com/kubernetes/kubernetes/pull/94603

Signed-off-by: Jacob Henner <code@ventricle.us>
2022-07-11 13:52:27 -06:00
Krzysztof Dryś f92e77f7f0
Remove legacy upgrade and its references (#7309)
With [linkerd2#5008](https://github.com/linkerd/linkerd2/issues/5008) and associated PRs, we changed the way configuration is handled by storing a helm values struct inside of the configmap.

Now that we have had one stable release with the new configuration, we no longer use or need to maintain the legacy config. This commit removes all the associated logic, protobuf files, and references.

Changes Include:

- Removed [`proto/config/config.proto`](https://github.com/linkerd/linkerd2/blob/main/proto/config/config.proto)
- Changed [`bin/protoc-go.sh`](https://github.com/linkerd/linkerd2/blob/main/bin/protoc-go.sh) to not include `config.proto`
- Changed [`FetchLinkerdConfigMap()`](741fde679b/pkg/healthcheck/healthcheck.go (L1768)) in `healthcheck.go` to return only the configmap, without the pb type.
- Changed [`FetchCurrentConfiguration()`](741fde679b/pkg/healthcheck/healthcheck.go (L1647)) to only unmarshal and use the Helm values struct from the configmap (as a follow-up to the todo above; note that there was already a todo there to refactor the function once the values struct became the default, which has now happened)
- Removed [`upgrade_legacy.go`](https://github.com/linkerd/linkerd2/blob/main/cli/cmd/upgrade_legacy.go)

Signed-off-by: Krzysztof Dryś <krzysztofdrys@gmail.com>
2021-11-29 20:08:58 +05:30
Kevin Leimkuhler 01cbe616f1
Honor Server `proxyProtocol` in destination service `Get` with policy CRD APIs (#7184)
This change ensures that if a Server with `proxyProtocol: opaque` selects an endpoint backed by a pod, then destination requests for that pod reflect the fact that it handles opaque traffic.

Currently, the only way that opaque traffic is honored in the destination service is if the pod has the `config.linkerd.io/opaque-ports` annotation. With the introduction of Servers though, users can set `server.Spec.ProxyProtocol: opaque` to indicate that if a Server selects a pod, then traffic to that pod's `server.Spec.Port` should be opaque. Currently, the destination service does not take this into account.

There is an existing change up that _also_ adds this functionality; it takes a different approach by creating a policy server client for each endpoint that a destination has. For `Get` requests on a service, the number of clients scales with the number of endpoints that back that service.

This change fixes that issue by instead creating a Server watch in the endpoint watcher and sending updates through to the endpoint translator.

The two primary scenarios to consider are:

### A `Get` request for some service is streaming when a Server is created/updated/deleted
When a Server is created or updated, the endpoint watcher iterates through its endpoint watches (`servicePublisher` -> `portPublisher`) and if it selects any of those endpoints, the port publisher sends an update if the Server has marked that port as opaque.

When a Server is deleted, the endpoint watcher once again iterates through its endpoint watches and deletes the address set's `OpaquePodPorts` field—ensuring that updates have been cleared of Server overrides.

### A `Get` request for some service happens after a Server is created
When a `Get` request occurs (or new endpoints are added—they both take the same path), we must check if any of those endpoints are selected by some existing Server. If so, we have to take that into account when creating the address set.

This part of the change gives me a little concern as we first must get all the Servers on the cluster and then create a set of _all_ the pod-backed endpoints that they select in order to determine if any of these _new_ endpoints are selected.
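
A minimal sketch, with simplified hypothetical types (the real Server CRD fields and watcher code differ), of the per-endpoint check described above:

```go
package watcher

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// serverSpec is a simplified stand-in for the policy.linkerd.io Server spec.
type serverSpec struct {
	PodSelector   *metav1.LabelSelector
	Port          int32
	ProxyProtocol string
}

// marksPortOpaque reports whether the Server selects the pod and declares the
// given port opaque.
func marksPortOpaque(srv serverSpec, pod *corev1.Pod, port int32) (bool, error) {
	if srv.ProxyProtocol != "opaque" || srv.Port != port {
		return false, nil
	}
	sel, err := metav1.LabelSelectorAsSelector(srv.PodSelector)
	if err != nil {
		return false, err
	}
	return sel.Matches(labels.Set(pod.Labels)), nil
}
```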

## Testing
Right now this can be tested by starting up the destination service locally and running `Get` requests on a service that has endpoints selected by a Server.

**app.yaml**
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod
  labels:
    app: pod
spec:
  containers:
  - name: app
    image: nginx
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: svc
spec:
  selector:
    app: pod
  ports:
  - name: http
    port: 80
---
apiVersion: policy.linkerd.io/v1alpha1
kind: Server
metadata:
  name: srv
  labels:
    policy: srv
spec:
  podSelector:
    matchLabels:
      app: pod
  port: 80
  proxyProtocol: HTTP/1
```

```bash
$ go run controller/script/destination-client/main.go -path svc.default.svc.cluster.local:80
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-11-23 20:35:53 -07:00
Bart Peeters 6bf507ad32
Change log level for watchers in destination svc
Make destination info logs clearer by changing the log level of the watcher log messages 'Establishing watch', 'Starting watch', and 'Stopping watch' from info to debug (#6917)

Signed-off-by: Bart Peeters <birtpeeters@hotmail.com>
2021-09-20 09:44:32 +01:00
Josh Soref 0be792fadc
Spelling (#6215)
This PR corrects misspellings identified by the [check-spelling action](https://github.com/marketplace/actions/check-spelling).

The misspellings have been reported at 0d56327e6f (commitcomment-51603624)

The action reports that the changes in this PR would make it happy: 03a9c310aa

Note: this PR does not include the action. If you're interested in running a spell check on every PR and push, that can be offered separately.

Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2021-06-07 15:16:59 -06:00
Tarun Pothulapati 5c1a375a51
destination: pass opaque-ports through cmd flag (#5829)
* destination: pass opaque-ports through cmd flag

Fixes #5817

Currently, the default opaque ports are stored in two places, i.e.
`Values.yaml` and `opaqueports/defaults.go`. As these ports are used
only in destination, we can instead pass these values as a cmd flag
for the destination component from `Values.yaml` and remove
defaultPorts in `defaults.go`.

This means that if users override the `opaquePorts` field in
`Values.yaml`, that change is propagated both for injection and for
discovery, as expected.
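
A minimal sketch, with a hypothetical flag name and wiring, of passing the defaults to the destination command instead of hard-coding them:

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

func main() {
	// The value would come from Values.yaml at install/render time.
	defaultOpaquePorts := flag.String("default-opaque-ports", "25,443,587,3306,5432,11211",
		"comma-separated ports treated as opaque by default")
	flag.Parse()

	ports := strings.Split(*defaultOpaquePorts, ",")
	fmt.Printf("destination starting with %d default opaque ports\n", len(ports))
}
```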

Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
2021-03-01 16:00:20 +05:30
Kevin Leimkuhler 51a965e228
Return default opaque ports in the destination service (#5814)
This changes the destination service to always use a default set of opaque ports
for pods and services. This is so that after Linkerd is installed onto a
cluster, users can benefit from common opaque ports without having to annotate
the workloads that serve the applications.

After #5810 merges, the proxy containers will have the default opaque ports
`25,443,587,3306,5432,11211`. This value on the proxy container does not affect
traffic though; it only configures the proxy.

In order for clients and servers to detect opaque protocols and determine opaque
transports, the pods and services need to have these annotations.

The ports `25,443,587,3306,5432,11211` are now handled opaquely when a pod or
service does not have the opaque ports annotation. If the annotation is present
with a different value, this is used instead of the default. If the annotation
is present but is an empty string, there are no opaque ports for the workload.
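
A minimal sketch of that fallback logic, with a hypothetical helper name:

```go
package opaqueports

const annotationKey = "config.linkerd.io/opaque-ports"

// resolve returns the opaque-ports value to use for a workload: the default
// when the annotation is absent, otherwise the annotation's value (an empty
// string meaning "no opaque ports").
func resolve(annotations map[string]string, defaultPorts string) string {
	if v, ok := annotations[annotationKey]; ok {
		return v // may be "", meaning no opaque ports
	}
	return defaultPorts
}
```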

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-24 14:55:31 -05:00
Kevin Leimkuhler ff93d2d317
Mirror opaque port annotations on services (#5770)
This change introduces an opaque ports annotation watcher that will send
destination profile updates when a service has its opaque ports annotation
change.

The user facing change introduced by this is that the opaque ports annotation is
now required on services when using the multicluster extension. This is because
the service mirror will create mirrored services in the source cluster, and
destination lookups in the source cluster need to discover that the workloads in
the target cluster are opaque protocols.

### Why

Closes #5650

### How

The destination server now has a new opaque ports annotation watcher. When a
client subscribes to updates for a service name or cluster IP, the `GetProfile`
method creates a profile translator stack that passes updates through resource
adaptors such as the traffic split adaptor, the service profile adaptor, and
now the opaque ports adaptor (a rough sketch follows the scenarios below).

When the annotation on a service changes, the update is passed through to the
client where the `opaque_protocol` field will either be set to true or false.

A few scenarios to consider are:

  - If the annotation is removed from the service, the client should receive
    an update with no opaque ports set.
  - If the service is deleted, the stream stays open so the client should
    receive an update with no opaque ports set.
  - If the service has the annotation added, the client should receive that
    update.
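
A minimal sketch, with hypothetical types, of how an annotation value could be translated into the `opaque_protocol` update described above; the real adaptor stack in the destination server is more involved:

```go
package watcher

import "strings"

type profileUpdate struct {
	OpaqueProtocol bool
	OpaquePorts    []string
}

// translate builds the update sent to profile subscribers from the current
// annotation value ("" meaning the annotation is unset or cleared) and the
// port being looked up.
func translate(annotationValue, targetPort string) profileUpdate {
	if annotationValue == "" {
		return profileUpdate{}
	}
	ports := strings.Split(annotationValue, ",")
	update := profileUpdate{OpaquePorts: ports}
	for _, p := range ports {
		if strings.TrimSpace(p) == targetPort {
			update.OpaqueProtocol = true
			break
		}
	}
	return update
}
```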

### Testing

Unit tests have been added to the watcher as well as the destination server.

An integration test has been added that tests the opaque port annotation on a
service.

For manual testing, using the destination server scripts is easiest:

```
# install Linkerd

# start the destination server
$ go run controller/cmd/main.go destination -kubeconfig ~/.kube/config

# Create a service or namespace with the annotation and inject it

# get the destination profile for that service and observe the opaque protocol field
$ go run controller/script/destination-client/main.go -method getProfile -path test-svc.default.svc.cluster.local:8080
INFO[0000] fully_qualified_name:"terminus-svc.default.svc.cluster.local" opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"terminus-svc.default.svc.cluster.local.:8080" weight:10000} 
INFO[0000]                                              
INFO[0000] fully_qualified_name:"terminus-svc.default.svc.cluster.local" opaque_protocol:true retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} dst_overrides:{authority:"terminus-svc.default.svc.cluster.local.:8080" weight:10000} 
INFO[0000]
```

Signed-off-by: Kevin Leimkuhler <kevin@kleimkuhler.com>
2021-02-23 13:36:17 -05:00