879 Commits

Author SHA1 Message Date
Alex Leong 396af7c946
refactor(multicluster): Replace use of unstructured API with typed bindings for Link CR (#13420)
The linkerd-multicluster extension uses client-go's `unstructured` API to access Link custom resources.  This API allowed us to develop quickly without the work of generating typed bindings.  However, using the unstructured API is error-prone since fields must be accessed by their string name.  It is also inconsistent with the rest of the project, which uses typed bindings.

We replace the use of the unstructured API for Link resources with generated typed bindings.  The client-go APIs are slightly different and client-go does not provide a way to update subresources for typed bindings.  Therefore, when updating a Link's status subresource, we use a patch instead of an update.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-12-10 11:44:19 -08:00
Oliver Gould 17b2692d58
build(deps): bump linkerd/dev from v43 to v44 (#13428)
* docker.io/library/golang from 1.22 to 1.23
* gotestsum from 0.4.2 to 1.12.0
* protoc-gen-go from 1.28.1 to 1.35.2
* protoc-gen-go-grpc from 1.2 to 1.5.1
* docker.io/library/rust from 1.76.0 to 1.83.0
* cargo-deny from 0.14.11 to 0.16.3
* cargo-nextest from 0.9.67 to 0.9.85
* cargo-tarpaulin from 0.27.3 to 0.31.3
* just from 1.24.0 to 1.37.0
* yq from 4.33.3 to 4.44.5
* markdownlint-cli2 from 0.10.0 to 0.15.0
* shellcheck from 0.9.0 to 0.10.0
* actionlint from 1.6.26 to 1.7.4
* protoc from 3.20.3 to 29.0
* step from 0.25.2 to 0.28.2
* kubectl from 1.29.2 to 1.31.3
* k3d from 5.6.0 to 5.7.5
* k3s image shas
* helm from 3.14.1 to 3.16.3
* helm-docs from 1.12.0 to 1.14.2
2024-12-06 11:38:36 -08:00
Oliver Gould 3c91fc64ce
fix(destination): avoid panic on missing managed fields timestamp (#13378)
We received a report of a panic:

    runtime error: invalid memory address or nil pointer dereference

    panic({0x1edb860?, 0x37a6050?}
        /usr/local/go/src/runtime/panic.go:785 +0x132

    github.com/linkerd/linkerd2/controller/api/destination/watcher.latestUpdated({0xc0006b2d80?, 0xc00051a540?, 0xc0008fa008?})
        /linkerd-build/vendor/github.com/linkerd/linkerd2/controller/api/destination/watcher/endpoints_watcher.go:1612 +0x125

    github.com/linkerd/linkerd2/controller/api/destination/watcher.(*OpaquePortsWatcher).updateService(0xc0007d5480, {0x21fd160?, 0xc000d71688?}, {0x21fd160, 0xc000d71688})
        /linkerd-build/vendor/github.com/linkerd/linkerd2/controller/api/destination/watcher/opaque_ports_watcher.go:141 +0x68

The `latestUpdated` function does not properly handle the case where a time is
omitted from a `ManagedFieldsEntry`.

    type ManagedFieldsEntry struct {
        // Time is the timestamp of when the ManagedFields entry was added. The
        // timestamp will also be updated if a field is added, the manager
        // changes any of the owned fields value or removes a field. The
        // timestamp does not update when a field is removed from the entry
        // because another manager took it over.
        // +optional
        Time *Time `json:"time,omitempty" protobuf:"bytes,4,opt,name=time"`

This change adds a check to avoid the nil dereference.
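
A minimal sketch of such a guard, assuming a simplified stand-in for `ManagedFieldsEntry` rather than the real Kubernetes type. The body of `latestUpdated` below is illustrative and is not the code from `endpoints_watcher.go`:

```go
package main

import (
	"fmt"
	"time"
)

// managedFieldsEntry is a simplified stand-in for the Kubernetes
// ManagedFieldsEntry type; only the optional timestamp is modelled.
type managedFieldsEntry struct {
	Manager string
	Time    *time.Time // may be nil: the field is marked +optional
}

// latestUpdated returns the most recent timestamp among the entries,
// skipping entries whose Time is nil instead of dereferencing them.
func latestUpdated(entries []managedFieldsEntry) time.Time {
	var latest time.Time
	for _, e := range entries {
		if e.Time == nil {
			continue // guard against the nil dereference
		}
		if e.Time.After(latest) {
			latest = *e.Time
		}
	}
	return latest
}

func main() {
	t := time.Date(2024, 11, 22, 0, 0, 0, 0, time.UTC)
	entries := []managedFieldsEntry{{Manager: "a"}, {Manager: "b", Time: &t}}
	fmt.Println(latestUpdated(entries))
}
```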
2024-11-22 15:21:09 -08:00
Derek Brown 80e444edbd
lint: fix docker build warnings (#13351)
Docker builds emit a warning because the case of 'FROM' and 'as' don't match. Fix this everywhere.

Signed-off-by: Derek Brown <6845676+DerekTBrown@users.noreply.github.com>
2024-11-20 08:44:45 -05:00
Alex Leong 752d1c9ea0
Add tests for federated service watcher (#13329)
Adds tests for the federated service watcher that exercise having remote and local clusters join and leave a federated service and ensuring that the correct proxy API updates are emitted.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-11-19 10:08:50 -08:00
MicahSee 264b67c5fe
Ensure consistent JSON logging for proxy-injector container (#13335)
When JSON logging is enabled in the proxy-injector (`controllerLogFormat: json`), some log messages adhere to the JSON format while others do not. This inconsistency makes logs difficult to parse, especially in automated workflows. For example:

```
{"level":"info","msg":"received admission review request \"83a0ce4d-ab81-42c9-abe4-e0ade0f926e2\"","time":"2024-10-10T21:06:18Z"}
time="2024-10-10T21:06:18Z" level=info msg="received pod/mypod"
```

Modified the logging implementation in the `controller/proxy-injector/webhook.go` to ensure all log messages follow the JSON format consistently. This was achieved by removing a new instance of the logrus logger that was being created in the file and replacing it with the global logger instance, ensuring all logs respect the controllerLogFormat configuration.

Reproduced the issue by enabling JSON logging and observing mixed-format logs when installing the emojivoto sample application. Applied the changes and verified that all logs consistently use the JSON format.
Ran the linkerd check command and confirmed there are no additional warnings or issues.
Tested various scenarios, including pods with and without the injection annotation, to ensure consistent logging behavior.
Fixes #13168

Signed-off-by: Micah See msee@usc.edu
2024-11-18 17:15:24 +00:00
Oliver Gould 5cbe45c86e
feat(destination): set parent and profile references (#13292)
In order for proxies to properly reflect the resources used to drive policy
decisions, the proxy API has been updated to include resource metadata on
ServiceProfile responses.

This change updates the profile translator to include ParentRef and ProfileRef
metadata when it is configured.

This change does not set backend or endpoint references.
2024-11-09 00:11:40 +00:00
Alex Leong 50b6a17e68
Add support for federated services to the service mirror controller (#13269)
When the service mirror controller detects a service in the remote cluster which matches the federated service selector (`mirror.linkerd.io/federated=member` by default), it will add that service to the federated service in the local cluster named `<svc>-federated`, creating this service if it does not already exist.  To join a service to a federated service, it is added to the `multicluster.linkerd.io/remote-discovery` annotation on the federated service, which contains a comma-separated list of values in the form `<svc>@<cluster>`.  When a remote service no longer exists or no longer matches the federated service selector, it is removed from the federated service by removing it from the `multicluster.linkerd.io/remote-discovery` annotation.

We also add a new `local-service-mirror` deployment to the linkerd-multicluster extension which watches the local cluster for any services which match the federated service selector.  Any services in the local cluster which match will be added to the federated service by setting the `multicluster.linkerd.io/local-discovery` annotation on the federated service to the local service name.
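
The join and leave operations on the remote-discovery annotation value can be sketched as plain string manipulation. The helper names below are hypothetical and are not taken from the service mirror code; only the `<svc>@<cluster>` comma-separated format comes from the description above:

```go
package main

import (
	"fmt"
	"strings"
)

// splitMembers parses the comma-separated list, dropping empty entries.
func splitMembers(value string) []string {
	members := []string{}
	for _, m := range strings.Split(value, ",") {
		if m = strings.TrimSpace(m); m != "" {
			members = append(members, m)
		}
	}
	return members
}

// joinFederated adds "<svc>@<cluster>" to the annotation value if it is
// not already present.
func joinFederated(value, svc, cluster string) string {
	target := svc + "@" + cluster
	members := splitMembers(value)
	for _, m := range members {
		if m == target {
			return strings.Join(members, ",")
		}
	}
	return strings.Join(append(members, target), ",")
}

// leaveFederated removes "<svc>@<cluster>" from the annotation value.
func leaveFederated(value, svc, cluster string) string {
	target := svc + "@" + cluster
	kept := []string{}
	for _, m := range splitMembers(value) {
		if m != target {
			kept = append(kept, m)
		}
	}
	return strings.Join(kept, ",")
}

func main() {
	v := joinFederated("", "web", "east")
	v = joinFederated(v, "web", "west")
	fmt.Println(v)
	fmt.Println(leaveFederated(v, "web", "east"))
}
```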

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-11-08 09:34:29 -08:00
Alex Leong c66f83e1f1
Add federated service watcher (#13267)
We add support for federated services to the destination controller by adding a new FederatedServiceWatcher.  When the destination controller receives a `Get` request for a Service with the `multicluster.linkerd.io/remote-discovery` and/or the `multicluster.linkerd.io/local-discovery` annotations, it subscribes to the FederatedServiceWatcher instead of subscribing to the EndpointsWatcher directly.  The FederatedServiceWatcher watches the federated service for any changes to these annotations, and maintains the appropriate watches on the local EndpointsWatcher and/or remote EndpointsWatchers fetched through the ClusterStore.

This means that we will often have multiple EndpointTranslators writing to the same `Get` response stream.  In order for a `NoEndpoints` message sent to one EndpointTranslator to not clobber the whole stream, we make a change where `NoEndpoints` messages are no longer sent to the response stream, but are replaced by a `Remove` message containing all of the addresses from that EndpointTranslator.  This allows multiple EndpointTranslators to coexist on the same stream.
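
A rough sketch of why per-translator `Remove` messages compose on a shared stream. The `update` and `translator` types here are hypothetical stand-ins, not the destination controller's actual types:

```go
package main

import (
	"fmt"
	"sort"
)

// update is a hypothetical stand-in for a message on the Get response stream.
type update struct {
	Kind  string // "Add" or "Remove"
	Addrs []string
}

// translator remembers the addresses it has sent so it can later retract
// exactly those addresses and nothing else.
type translator struct {
	sent   map[string]struct{}
	stream *[]update // shared by every translator writing to the same stream
}

func newTranslator(stream *[]update) *translator {
	return &translator{sent: map[string]struct{}{}, stream: stream}
}

func (t *translator) add(addrs ...string) {
	for _, a := range addrs {
		t.sent[a] = struct{}{}
	}
	*t.stream = append(*t.stream, update{Kind: "Add", Addrs: addrs})
}

// noEndpoints emits a Remove for this translator's own addresses instead of
// a stream-wide NoEndpoints message.
func (t *translator) noEndpoints() {
	addrs := make([]string, 0, len(t.sent))
	for a := range t.sent {
		addrs = append(addrs, a)
	}
	sort.Strings(addrs)
	t.sent = map[string]struct{}{}
	*t.stream = append(*t.stream, update{Kind: "Remove", Addrs: addrs})
}

func main() {
	var stream []update
	local, remote := newTranslator(&stream), newTranslator(&stream)
	local.add("10.0.0.1:80")
	remote.add("10.1.0.1:80", "10.1.0.2:80")
	remote.noEndpoints() // only the remote addresses are retracted
	fmt.Println(stream)
}
```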

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-11-08 09:34:01 -08:00
Alex Leong bcc563812a
Update generated client-go code (#13167)
Our generated client-go code committed in the repo has diverged from the code generated by the codegen tools.

We bring them back in sync by running bin/updated-codegen.sh. This should be a non-functional and non-breaking change.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-10-22 17:08:43 -07:00
Scott Fleener 958cfca666
Export zone locality in outbound destination metrics (#13129)
Currently, we don't have a simple way of checking if the endpoint a proxy is discovering is in the same zone or not.

This adds a "zone_locality" metric label to the outbound destination address metrics. Note that this does not increase the cardinality of the related metrics, as this label doesn't vary within an endpoint.

Validated by checking the prometheus metrics on a local cluster and verifying this label appears in the outbound transport metrics.

Signed-off-by: Scott Fleener <scott@buoyant.io>
2024-10-15 13:43:05 -07:00
Vadim Makerov 005a3a470e
Implement providing configuration for TCP_USER_TIMEOUT to linkerd-proxy (#13024)
The default value of 30s is enough for the Linux TCP stack to complete about 7 packet retransmissions. After about 7 retransmissions, the RTO (retransmission timeout) grows rapidly, and it does not make much sense to wait any longer.

Setting TCP_USER_TIMEOUT only on connections between linkerd-proxy and the outside world is enough, since connections to containers in the same pod are more stable and reliable.
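
As a worked check of the "about 7 retransmissions" figure, here is a small calculation. It assumes the retransmission timeout starts at 200ms and doubles after each retransmission; the real starting RTO depends on the measured round-trip time, so treat the numbers as illustrative:

```go
package main

import (
	"fmt"
	"time"
)

// retransmitsWithin counts how many retransmissions fire within budget when
// the retransmission timeout starts at initialRTO and doubles each time.
func retransmitsWithin(initialRTO, budget time.Duration) int {
	if initialRTO <= 0 {
		return 0
	}
	count := 0
	elapsed := time.Duration(0)
	for rto := initialRTO; elapsed+rto <= budget; rto *= 2 {
		elapsed += rto
		count++
	}
	return count
}

func main() {
	// 0.2 + 0.4 + 0.8 + 1.6 + 3.2 + 6.4 + 12.8 = 25.4s; the 8th would fire at 51s.
	fmt.Println(retransmitsWithin(200*time.Millisecond, 30*time.Second)) // prints 7
}
```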

Fixes #13023

Signed-off-by: UsingCoding <extendedmoment@outlook.com>
2024-10-03 16:08:01 +00:00
Maxime Brunet 260fd19a2a
build: add image source label to all Dockerfiles (#13042)
This change adds the org.opencontainers.image.source label to all Dockerfiles.

It allows tools to find the source repository for the produced images.

Signed-off-by: Maxime Brunet <max@brnt.mx>
2024-09-10 11:25:32 -07:00
dependabot[bot] 4baa94baac
build(deps): bump k8s.io/client-go from 0.30.3 to 0.31.0 (#12958)
* build(deps): bump k8s.io/client-go from 0.30.3 to 0.31.0

Bumps [k8s.io/client-go](https://github.com/kubernetes/client-go) from 0.30.3 to 0.31.0.
- [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md)
- [Commits](https://github.com/kubernetes/client-go/compare/v0.30.3...v0.31.0)

---
updated-dependencies:
- dependency-name: k8s.io/client-go
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* To appease the linter, replaced deprecated workqueue interfaces with their typed alternatives. For the endpoints controller we can instantiate with . But for the service mirror, given the queue can hold different event types, we have to instantiate with .

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-09-04 09:04:04 -05:00
Alejandro Pedraza 567288a060
Dual-stack support for ExternalWorkloads (#12965)
* Dual-stack support for ExternalWorkloads

This changes the `workloadIPs.maxItems` field in the ExternalWorkload CRD from `1` to `2`, to accommodate an IPv4 and IPv6 pair. This is a backwards-compatible change, so there's no need to bump the CRD version.

The control plane already supports this, so this change is mainly about expanding the unit tests to also account for the dual-stack case.
2024-08-30 13:23:56 -05:00
Alejandro Pedraza 332c4efa8c
Only bind to IPv6 addresses when disableIPv6=false (#12938)
## Problem

When the IPv6 stack in Linux is disabled, the proxy will crash at startup.

## Repro

In a Linux machine, disable IPv6 networking through the `net.ipv6.conf.*` sysctl kernel tunables, and restart the system:

- In /etc/sysctl.conf add:
```
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
```

- In /etc/default/grub set:
```
GRUB_CMDLINE_LINUX="ipv6.disable=1"
```

- Don't forget to update grub before rebooting:
```
sudo update-grub
```

In a default k3d cluster, install Linkerd. You should see the following error in any proxy log:

```
thread 'main' panicked at /__w/linkerd2-proxy/linkerd2-proxy/linkerd/app/src/lib.rs:245:14:
Failed to bind inbound listener: Os { code: 97, kind: Uncategorized, message: "Address family not supported by protocol" }
```

## Cause

Even if a k8s cluster didn't support IPv6, we were counting on the nodes having an IPv6 stack, which allowed us to bind the inbound proxy to [::] (although not to [::1] for the outbound proxy, as seen in GKE). This was the case in the major cloud providers we tested, but it turns out there are folks running nodes with IPv6 disabled, so we have to cater to that case as well.

## Fix

The current change undoes some of the changes from 7cbe2f5ca6 (for the proxy config), 7cbe2f5ca6 (for the policy controller) and 66034099d9 (for linkerd-cni), binding back again to 0.0.0.0 unless `disableIPv6` is false.
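
The address selection described above can be sketched as follows. The function name is hypothetical and port 4143 is used only as an example value; the actual change lives in Helm templates and controller configuration rather than in a helper like this:

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// inboundListenAddr picks the wildcard address for a listener: the IPv6
// wildcard only when IPv6 is explicitly enabled, else the IPv4 wildcard.
func inboundListenAddr(disableIPv6 bool, port int) string {
	host := "0.0.0.0"
	if !disableIPv6 {
		host = "::"
	}
	// JoinHostPort adds the brackets that IPv6 literals require.
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(inboundListenAddr(true, 4143))
	fmt.Println(inboundListenAddr(false, 4143))
}
```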
2024-08-05 13:29:55 -05:00
Alex Leong 59fbf7b01d
feat(proxy): Disable all header and request logging (#12903)
The Linkerd proxy suppresses all logging of HTTP headers at debug level or higher unless the `proxy.logHTTPHeaders` Helm value is set to `insecure`.  However, even when this value is not set, HTTP headers can still be logged if the log level is set to trace.

We update the log string we use to disable logging of HTTP headers from `linkerd_proxy_http::client[{headers}]=off` to the more general `[{headers}]=off,[{request}]=off`.  This will disable any logging which includes a `headers` or `request` field.  This has the effect of disabling the logging of headers at trace level as well.  As before, these logs can be re-enabled by setting `proxy.logHTTPHeaders=insecure`.
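
A sketch of how the filter string might be assembled. The function name and the base filter `warn,linkerd=info` are assumptions for illustration; only the appended directives and the `insecure` opt-in come from the description above:

```go
package main

import "fmt"

// proxyLogFilter appends the directives that silence any log event carrying
// a `headers` or `request` field, unless the operator has opted in.
func proxyLogFilter(base, logHTTPHeaders string) string {
	if logHTTPHeaders == "insecure" {
		return base
	}
	return base + ",[{headers}]=off,[{request}]=off"
}

func main() {
	fmt.Println(proxyLogFilter("warn,linkerd=info", ""))
	fmt.Println(proxyLogFilter("warn,linkerd=info", "insecure"))
}
```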

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-07-31 11:31:25 -07:00
Alejandro Pedraza 71291fe7bc
Add `accessPolicy` field to Server CRD (#12845)
Followup to #12844

This new field defines the default policy for Servers, i.e. if a request doesn't match the policy associated with a Server then this policy applies. The values are the same as for `proxy.defaultInboundPolicy` and the `config.linkerd.io/default-inbound-policy` annotation (all-unauthenticated, all-authenticated, cluster-authenticated, cluster-unauthenticated, deny), plus a new value "audit". The default is "deny", thus remaining backwards-compatible.

This field is also exposed as an additional printer column.
2024-07-22 09:01:09 -05:00
Matei David f05d1e9e26
feat(helm): default proxy-init resource requests to proxy values (#12741)
Default values for `linkerd-init` (resources allocated) are not always
the right fit. We offer default values to ensure proxy-init does not get
in the way of QOS Guaranteed (`linkerd-init` resource limits and
requests cannot be configured in any other way).

Instead of using default values that can be overridden, we can re-use
the proxy's configuration values. For the pod to be QOS Guaranteed, the
values for the proxy have to be set anyway. If we re-use the same
values for proxy-init we can ensure we'll always request the same amount
of CPU and memory as needed.

* `linkerd-init` now defaults to the proxy's values
* when the proxy has an annotation configuration for resource requests,
  it also impacts `linkerd-init`
* Helm chart and docs have been updated to reflect the missing values.
* tests now no longer use `ProxyInit.Resources`

UPGRADE NOTE:
- Deprecates `proxyInit.resources` field in the Helm values.
  - It will be a no-op if specified (no hard failures)

Closes #11320

---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-06-24 12:37:47 +01:00
Alex Leong 35fb2d6d11
feat!: Add config to disable proxy /shutdown admin endpoint (#12705)
The proxy may expose a /shutdown HTTP endpoint on its admin server that may be used by `linkerd-await --shutdown` to trigger proxy shutdown after a process completes. If an application has an SSRF vulnerability, however, an attacker could use this endpoint to trigger proxy shutdown, causing a denial of service. This admin endpoint is only useful with linkerd-await; and this functionality is supplanted by Kubernetes Native Sidecars.

To address this potential issue, this change disables the proxy's /shutdown admin endpoint by default. A Helm value is introduced to support enabling the endpoint cluster-wide, and the `config.linkerd.io/proxy-admin-shutdown: enabled` annotation may be set to enable it on an individual workload.
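
The precedence between the cluster-wide setting and the per-workload annotation could look roughly like this. The function is hypothetical, and the `disabled` annotation value is an assumption; the text above only confirms `enabled`:

```go
package main

import "fmt"

const shutdownAnnotation = "config.linkerd.io/proxy-admin-shutdown"

// shutdownEnabled reports whether the /shutdown endpoint should be served.
// A workload annotation, when present, overrides the cluster-wide default.
func shutdownEnabled(clusterDefault bool, annotations map[string]string) bool {
	switch annotations[shutdownAnnotation] {
	case "enabled":
		return true
	case "disabled": // assumed counterpart value, not confirmed by the commit
		return false
	default:
		return clusterDefault
	}
}

func main() {
	fmt.Println(shutdownEnabled(false, nil))
	fmt.Println(shutdownEnabled(false, map[string]string{shutdownAnnotation: "enabled"}))
}
```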

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-06-14 09:55:15 -07:00
Alejandro Pedraza 80a1803c7f
Properly set log level for hickory dependency in proxy (#12722)
Followup to linkerd/linkerd2-proxy#2872 , where we swapped the
trust-dns-resolver with the hickory-resolver dependency. This change
updates the default log level setting for the proxy to account for
that.
2024-06-14 08:32:12 -07:00
Alejandro Pedraza b59149388f
Bump proxy-init to v2.4.1 and cni-plugin to v1.5.1 (#12711)
Those releases ensure that when IPv6 is enabled, the series of ip6tables commands succeed. If they fail, the proxy-init/linkerd-cni containers should fail as well, instead of ignoring errors.

See linkerd/linkerd2-proxy-init#388
2024-06-13 17:15:41 -05:00
Alex Leong e0fe0248d5
Add config to disable HTTP proxy logging (#12665)
Fixes #12620

When the Linkerd proxy log level is set to `debug` or higher, the proxy logs HTTP headers which may contain sensitive information.

While we want to avoid logging sensitive data by default, logging of HTTP headers can be a helpful debugging tool.  Therefore, we add a `proxy.logHTTPHeaders` Helm value which prevents the logging of HTTP headers when set to false.  The default is false, so headers cannot be logged unless users opt in.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-06-11 17:46:54 -07:00
Alex Leong 3bd01cac9c
add nil check when reading endpoint hostname (thanks @acallejaszu) (#12689)
Fixes #12686

When an endpoint in an EndpointSlice resource does not contain a hostname field, the destination controller can panic while looking for an endpoint with a certain hostname.  This happens when doing a lookup with a pod DNS name.

We add a nil check to avoid the panic.
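
A minimal sketch of the guard, using a simplified stand-in for the EndpointSlice endpoint type, where the hostname is an optional pointer field. The lookup function is illustrative, not the controller's own:

```go
package main

import "fmt"

// endpoint is a simplified stand-in for an EndpointSlice endpoint; the
// hostname is optional and therefore modelled as a pointer.
type endpoint struct {
	Hostname  *string
	Addresses []string
}

// findByHostname returns the first endpoint with the given hostname,
// skipping endpoints that have no hostname at all.
func findByHostname(endpoints []endpoint, hostname string) (endpoint, bool) {
	for _, ep := range endpoints {
		if ep.Hostname == nil {
			continue // nil check: dereferencing here would panic
		}
		if *ep.Hostname == hostname {
			return ep, true
		}
	}
	return endpoint{}, false
}

func main() {
	name := "web-0"
	eps := []endpoint{
		{Addresses: []string{"10.0.0.1"}}, // no hostname
		{Hostname: &name, Addresses: []string{"10.0.0.2"}},
	}
	ep, ok := findByHostname(eps, "web-0")
	fmt.Println(ep.Addresses, ok)
}
```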

We add such an endpoint to our test fixture to exercise this case.

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-06-10 10:45:31 -07:00
Nico Feulner 3d674599b3
make group ID configurable (#11924)
Fixes #11773

Make the proxy's GID configurable via `proxy.gid`, which defaults to `-1`, in which case the GID is not set.
Also added the ability to set the GID for proxy-init and the core and extension controllers.

---------

Signed-off-by: Nico Feulner <nico.feulner@gmail.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-05-23 15:54:21 -05:00
dependabot[bot] d42432914d
build(deps): bump google.golang.org/grpc from 1.63.2 to 1.64.0 (#12593)
* build(deps): bump google.golang.org/grpc from 1.63.2 to 1.64.0

Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.63.2 to 1.64.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.63.2...v1.64.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

I've replaced all the `grpc.Dial` calls with `grpc.NewClient`. There was one call `grpc.DialContext(ctx,...)` in `viz/tap/api/grpc_server.go` that also got replaced with `grpc.NewClient`, which loses the ability to pass `ctx`, but that doesn't matter: as we're not using `WithBlock(true)`, that context wasn't being accounted for when we were using `DialContext()` anyway.

https://github.com/grpc/grpc-go/blob/v1.64.0/clientconn.go#L216-L242

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
2024-05-22 14:40:04 -05:00
Alejandro Pedraza df24ea6d96
Refactor ES addition logic in Destination (#12625)
This is a second take on #12427, which avoided a theoretical/correctness
issue around overwriting new ES addresses with stale data.

We had to revert that in #12589 because the change introduced a bug by
returning early when the ES had no addresses, which failed to properly
initialize `addresses` for the portPublisher.

This just removes the early return.
2024-05-22 09:12:19 -05:00
Matei David 407df01ec3
chore(controller): Remove stream concurrency limits (#12598)
Our gRPC servers use the default gRPC server configuration, which
limits the number of concurrent streams to 100. Since the controllers
run with proxies, this provides a hard scaling limit for the number of
watches an application can have.

This change updates our gRPC server configuration to clear the default
concurrency limit, allowing the server to handle as many streams as
possible.

Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
2024-05-15 18:07:15 +01:00
Alejandro Pedraza 9bd8c005da
Revert "Fix destination staleness issue when adding EndpointSlices (#12427)" (#12589)
This reverts commit 4fccf3e9ec.

The early return was causing `pp.addresses = newAddressSet` to not be run when the list of addresses is empty; but setting that is still necessary so that labels are tracked correctly.
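
The shape of the problem can be sketched with a heavily simplified port publisher. The types and method below are illustrative only; the point is that the state assignment must run even when there is nothing new to announce:

```go
package main

import "fmt"

// portPublisher is a simplified stand-in: addresses maps an address ID to
// its labels, and notified counts the Add notifications that were sent.
type portPublisher struct {
	addresses map[string]string
	notified  int
}

// addEndpointSlice sends notifications only for new addresses, but always
// replaces the stored address set so that later bookkeeping stays correct.
func (pp *portPublisher) addEndpointSlice(newAddressSet map[string]string) {
	adds := 0
	for id := range newAddressSet {
		if _, ok := pp.addresses[id]; !ok {
			adds++
		}
	}
	if adds > 0 {
		pp.notified += adds
	}
	// No early return above: the state is updated even when adds == 0.
	pp.addresses = newAddressSet
}

func main() {
	pp := &portPublisher{}
	pp.addEndpointSlice(map[string]string{}) // an empty slice still initializes state
	fmt.Println(pp.addresses != nil, pp.notified)
}
```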

This was caught by the tap (viz) integration test run in the release workflow.
2024-05-13 16:25:50 -07:00
Alejandro Pedraza 7dbafb26c8
Make IPv6 support opt-in (#12576)
This changes the default of the Helm value `disableIPv6` to `true`.
Additionally, the proxy's `LINKERD2_PROXY_OUTBOUND_LISTEN_ADDRS` env var
is now set accordingly to that value.

This addresses an incompatibility with GKE introduced in last week's
edge (`edge-24.5.1`): in default IPv4-only nodes in GKE clusters the
proxy can't bind to `::1`, so we have to make IPv6 opt-in to avoid
surprises.
2024-05-09 09:09:26 -05:00
Alejandro Pedraza 4fccf3e9ec
Fix destination staleness issue when adding EndpointSlices (#12427)
When updating the portPublisher's address set on receiving a new EndpointSlice creation event, its addresses were getting overwritten with stale data whenever its IDs already existed in the current pp's address set.

There can be pathological cases in single-stack where this can be a problem. For example, when an ES gets recycled but the deletion event is not caught for some reason, its Address data will be overwritten by the old stale entry when the addition event is received.

## Other Changes

- Remove overriding `newAddressSet.LocalTrafficPolicy` as that is already taken care of inside `pp.endpointSliceToAddresses(slice)`.
- When there are no Add events to send, return early without updating state or metrics.
2024-05-08 09:12:51 -05:00
Alex Leong d3e227fbd7
Fix flakey Handles_overflow test (#12555)
The `Handles overflow` test for the endpoint profile translator writes updates into the updates queue until it is full and then tests that no more updates can be enqueued.  However, since the test also starts the profile translator, it is concurrently draining updates off of the queue as well.  This leads to unpredictable results and test flakiness.

We update the test to not start the translator so that updates are not drained off of the queue during the test.
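
The deterministic version of such a test can be sketched with a bounded channel and no consumer. The names are hypothetical and the queue is a plain buffered channel rather than the translator's real update queue:

```go
package main

import "fmt"

// enqueue performs a non-blocking send and reports whether the update fit.
func enqueue(queue chan<- int, update int) bool {
	select {
	case queue <- update:
		return true
	default:
		return false
	}
}

// fillUntilFull fills a queue of the given capacity without starting any
// consumer, then checks that one more update overflows.
func fillUntilFull(capacity int) (accepted int, overflowed bool) {
	queue := make(chan int, capacity) // nothing drains this queue
	for i := 0; i < capacity; i++ {
		if enqueue(queue, i) {
			accepted++
		}
	}
	return accepted, !enqueue(queue, capacity)
}

func main() {
	fmt.Println(fillUntilFull(3))
}
```

With a consumer goroutine draining the channel concurrently, the final `enqueue` could succeed or fail depending on scheduling, which is the source of flakiness described above.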

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-05-07 13:06:58 -07:00
Alejandro Pedraza 137eac9df3
Add IPv6 support for the destination controller (#12428)
Services in dual-stack mode result in the creation of two EndpointSlices, one for each IP family. Before this change, the Get Destination API would nondeterministically return the address for any of those ES, depending on which one was processed last by the controller because they would overwrite each other.

As part of the ongoing effort to support IPv6/dual-stack networks, this change fixes that behavior giving preference to IPv6 addresses whenever a service exposes both families.
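
The preference rule can be sketched over plain address strings. This is an illustration using `net/netip`, not the controller's actual selection logic, and the function name is hypothetical:

```go
package main

import (
	"fmt"
	"net/netip"
)

// preferIPv6 returns only the IPv6 addresses when any are present and
// falls back to the IPv4 addresses otherwise.
func preferIPv6(addrs []string) ([]string, error) {
	var v4, v6 []string
	for _, a := range addrs {
		ip, err := netip.ParseAddr(a)
		if err != nil {
			return nil, err
		}
		if ip.Is6() && !ip.Is4In6() {
			v6 = append(v6, a)
		} else {
			v4 = append(v4, a)
		}
	}
	if len(v6) > 0 {
		return v6, nil
	}
	return v4, nil
}

func main() {
	fmt.Println(preferIPv6([]string{"10.0.0.1", "2001:db8::1"}))
	fmt.Println(preferIPv6([]string{"10.0.0.1"}))
}
```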

There is a new set of unit tests in server_ipv6_test.go, and in the TestEndpointTranslatorForPods tests there are a couple of new cases to test the interaction with zone filtering.
Also the server unit tests were updated to segregate the tests and resources dealing with the IPv4/IPv6/dual-stack cases.
2024-05-02 14:39:05 -05:00
Alejandro Pedraza 7cbe2f5ca6
Enable forwarding IPv6 connections through the proxy (#12495)
As part of the ongoing effort to support IPv6/dual-stack networks, this change
enables the proxy to properly forward IPv6 connections:

- Adds the new `LINKERD2_PROXY_OUTBOUND_LISTEN_ADDRS` environment variable when
  injecting the proxy. This is supported as of proxy v2.228.0 which was just
  pulled into the linkerd2 repo in #2d5085b56e465ef56ed4a178dfd766a3e16a631d.
  This adds the IPv6 loopback address (`[::1]`) to the IPv4 one (`127.0.0.1`)
  so the proxy can forward outbound connections received via IPv6. The injector
  will still inject `LINKERD2_PROXY_OUTBOUND_LISTEN_ADDR` to support the rare
  case where the `proxy.image.version` value is overridden with an older
  version. The new proxy still considers that variable, but it's superseded by
  the new one. The old variable is considered deprecated and should be removed
  in the future.
- The values for `LINKERD2_PROXY_CONTROL_LISTEN_ADDR`,
  `LINKERD2_PROXY_ADMIN_LISTEN_ADDR` and `LINKERD2_PROXY_INBOUND_LISTEN_ADDR`
  have been updated to point to the IPv6 wildcard address (`[::]`) instead of
  the IPv4 one (`0.0.0.0`) for the same reason. Unlike with the loopback
  address, the IPv6 wildcard address suffices to capture both IPv4 and IPv6
  traffic.
- The endpoint translator's `getInboundPort()` has been updated to properly
  parse the IPv6 loopback address retrieved from the proxy container manifest.
  A unit test was added to validate the behavior.
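
A sketch of family-agnostic port parsing for listen addresses such as `[::]:4143`. The function below is illustrative and is not the actual `getInboundPort()`; splitting on `:` by hand would break on IPv6 literals, so it relies on `net.SplitHostPort`:

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// inboundPort extracts the port from a listen address, handling both
// "0.0.0.0:4143" and bracketed IPv6 forms such as "[::]:4143".
func inboundPort(listenAddr string) (uint16, error) {
	_, portStr, err := net.SplitHostPort(listenAddr)
	if err != nil {
		return 0, err
	}
	port, err := strconv.ParseUint(portStr, 10, 16)
	if err != nil {
		return 0, err
	}
	return uint16(port), nil
}

func main() {
	fmt.Println(inboundPort("0.0.0.0:4143"))
	fmt.Println(inboundPort("[::]:4143"))
}
```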
2024-05-02 16:39:19 +01:00
Oliver Gould cae534957d
chore: Fix whitespace and typos (#12540)
No functional changes.
2024-05-02 16:30:51 +01:00
Kevin Ingelman 5f068bfbd7
Restore Server v1beta1 Go API definition (#12529)
The `v1beta1` Go API definition for Servers was removed in #11920, in favor of the `v1beta2` definition that was being added. For backwards compatibility, the `v1beta1` definition should have been left in place.

Signed-off-by: Kevin Ingelman <ki@buoyant.io>
2024-05-01 16:16:48 -07:00
Oliver Gould aef8a02426
feat(destination): Add meshed HTTP/2 keep-alive settings (#12504)
This commit adds destination controller configuration that enables default
keep-alives for meshed HTTP/2 clients.

This is accomplished by encoding the raw protobuf message structure into the
helm values, and then encoding that as JSON in the destination controller's
command-line options. This allows operators to set any supported HTTP/2 client
configuration without having to modify the destination controller.
2024-04-30 19:35:30 +00:00
Oliver Gould 246c62e7d3
feat: Configure default HTTP/2 server keep-alives (#12498)
HTTP/2 keep-alives enable HTTP/2 servers to issue PING messages to clients to
ensure that the connections are still healthy. This is especially useful when
the OS loses a FIN packet, which causes the connection resources to be held by
the proxy until we attempt to write to the socket again. Keep-alives provide a
mechanism to issue periodic writes on the socket to test the connection health.

5760ed2 updated the proxy to support HTTP/2 server configuration and 9e241a7
updated the proxy's Helm templating to support arbitrary configuration.
Together, these changes enable configuring default HTTP/2 server keep-alives
for all data planes in a cluster.

The default keep-alive configuration uses a 10s interval and a 3s timeout. These
are chosen to be conservative enough that they don't trigger false positive
timeouts, but frequent enough that these connections may be garbage collected
in a timely fashion.
2024-04-29 13:51:21 -07:00
knowmost 27bcdd1028
chore: fix function names in comment (#12512)
Signed-off-by: knowmost <knowmost@outlook.com>
2024-04-29 10:28:10 -07:00
Alejandro Pedraza 6db4bd667c
Fix issues with native sidecars (#12453)
Closes #12395

Failing to iterate over init containers as well as regular containers for finding the proxy in various parts of the code when the proxy is injected as a native sidecar resulted in:

- `Get` Destination API failing in the presence of opaque ports
- The injector failing to detect already-injected pods
- Various CLI issues
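
The common fix across these cases is to search both container lists. A minimal sketch with simplified stand-ins for the pod spec types follows; the helper name is hypothetical:

```go
package main

import "fmt"

// container and podSpec are simplified stand-ins for the Kubernetes types.
type container struct{ Name string }

type podSpec struct {
	InitContainers []container
	Containers     []container
}

const proxyContainerName = "linkerd-proxy"

// hasProxy looks for the proxy among init containers as well as regular
// containers, since a native sidecar is declared as an init container.
func hasProxy(spec podSpec) bool {
	for _, list := range [][]container{spec.InitContainers, spec.Containers} {
		for _, c := range list {
			if c.Name == proxyContainerName {
				return true
			}
		}
	}
	return false
}

func main() {
	nativeSidecar := podSpec{
		InitContainers: []container{{Name: "linkerd-proxy"}},
		Containers:     []container{{Name: "app"}},
	}
	fmt.Println(hasProxy(nativeSidecar))
}
```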

This PR is split into the following commits addressing each issue separately:

a8ebe76e3 - Fix injection check for existing sidecars
44e9625e0 - Fix 'linkerd uninject'
62694965d - Fix 'linkerd version --proxy'
42dbdaddf - Fix 'linkerd identity'
39db823fe - Fix 'linkerd check'
7359f371d - Fix 'linkerd dg proxy-metrics'
f8f73c47c - Fix destination controller
2024-04-26 14:38:01 -05:00
Matei David 4ce461e967
Bump proxy-init and CNI plugin versions (#12462)
A new release has been cut for both. The new release adds a new `GID`
feature that allows iptables to skip traffic originating from a process
running under the specified GID. The CNI plugin also includes a fix for
native sidecar containers.

* Bump proxy-init from `v2.3.0` to `v2.4.0`
* Bump CNI plugin from `v1.4.0` to `v1.5.0`

---------

Signed-off-by: Matei David <matei@buoyant.io>
2024-04-19 10:50:28 +01:00
Matei David 38c6d11832
Change injector overriding logic to be more generic (#12405)
The proxy-injector package has a `ResourceConfig` type that is
responsible for parsing resources, applying overrides, and serialising a
series of configuration values to a Kubernetes patch. The functionality
is very concrete in its assumption; it always relies on a pod spec and
it mutates inner state when deciding on which overrides to apply.

This is not a flexible way to handle injection and configuration
overriding for other types of resources. We change this by turning
methods previously defined on `ResourceConfig` into free-standing
functions. These functions can be applied for any type of resources in
order to compute a set of configuration values based on annotation
overrides. Through the change, the functions can be used to compute
static configuration for non-Pod types or can be used in tests.


Signed-off-by: Matei David <matei@buoyant.io>
2024-04-10 15:51:58 +01:00
hanghuge 78d42b230d
chore: fix function name in comment (#12396)
Fixed comments for `subscribeToServicesWithContext` and `reconcileByAddressType`. Previously,
the comments contained incorrect function names.

Signed-off-by: hanghuge <cmoman@outlook.com>
2024-04-10 15:46:45 +01:00
Alejandro Pedraza 8b0e55ab38
Upgrade to proxy-init:v2.3.0 and linkerd-cni:1.4.0 (#12361)
2024-04-02 11:38:53 -05:00
occupyhabit 6eeaea4d94
chore: Remove repetitive words (#12330)
Signed-off-by: occupyhabit <wangmengjiao@outlook.com>
2024-03-25 09:33:39 -07:00
Hirotaka Tagawa / wafuwafu13 9a5284f453
controller: Stop logging errors on shutdown (#12167)
When a controller is shut down, the admin server fails. This failure is logged
as an error, even when the shutdown was graceful.

This change updates the shutdown behavior to log more appropriately.

Signed-off-by: wafuwafu13 <jaruwafu@gmail.com>
2024-03-22 09:49:37 -07:00
Adarsh jaiswal ccdf6b74ed
injector: Stop emitting warnings about skipped resources (#12254)
The injector may try to resolve kinds that it does not know about.
When it does so, it logs a warning. In clusters with these unexpected
workload owners, the injector logs these warnings excessively.

This change reduces these logs to trace-level.

Signed-off-by: adarsh-jaiss <its.adarshjaiss@gmail.com>
2024-03-22 09:47:53 -07:00
Alex Leong 2c2a96bc73
Removes should not change local traffic policy (#12325)
Fixes: #12311

When the endpoint translator receives a `remove` call, it was updating its local traffic policy based on the address set passed to remove.  However, since `remove` is only meant to remove addresses and not change the address metadata, the endpoints watcher was not setting local traffic policy on these calls to `remove`.  This can result in calls to `remove` temporarily turning off local traffic policy, which will cause non-local addresses to be sent to clients.

Since `remove` should not change address metadata, we now disregard any metadata in the call to `remove`, including any changes to the local traffic policy.

Signed-off-by: Alex Leong <alex@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
2024-03-22 09:42:13 -07:00
Alejandro Pedraza b697e285a0
Refactor IPv4-only functions to also work for IPv6 (#12303)
The main change here is the refactoring of the address functions in `addr.go` that support the Destination controller and Viz's Tap controller. Some of those functions only worked for IPv4, so this change refactored them to make them IP family agnostic.
This enabled adding (and fixing) IPv6 unit tests as detailed in the following sections.

Other changes:

- The `ProxyAddressesToString()` function was no longer used, so it got removed.
- The `ProxyIPToString()` function was only used by the destination-client script, so that got stripped out.

## `addr_test.go`

We added IPv6 cases to each test, that would have failed previously.

## `endpoint_translator_test.go`

One of the test pods (pod3) was changed to have an IPv6 address. Without the other changes in this PR those tests would still have passed, but only because, when comparing actual IPs with expected ones, we weren't checking whether they were both zero. So here we added checks against that.

## `server_test.go`

As above, we added checks against empty IPs. And in the mocked resources in `test_util.go` we added an IPv6 EndpointSlice.
2024-03-22 07:20:52 -05:00
Alex Leong 5915ef5a18
Don't send endpoint profile updates from Server updates when opaqueness doesn't change (#12013)
When the destination controller receives an update for a Server resource, we recompute opaqueness ports for all pods.  This results in a large number of updates to all endpoint profile watches, even if the opaqueness doesn't change.  In cases where there are many Server resources, this can result in a large number of updates being sent to the endpoint profile translator and overflowing the endpoint profile translator update queue.  This is especially likely to happen during an informer resync, since this will result in an informer callback for every Server in the cluster.

We refactor the workload watcher to not send these updates if the opaqueness has not changed.

This seemingly simple change in behavior requires a large code change because:
* the current opaqueness state is not stored on workload publishers and must be added so that we can determine if the opaqueness has changed
* storing the opaqueness in addition to the other state we're storing (pod, ip, port, etc.) means that we are now storing all of the data represented by the Address struct
* workload watcher uses a `createAddress` func to dynamically create an Address from the state it stores
* now that we are storing the Address as state, creating Addresses dynamically is no longer necessary and we can operate on the Address state directly
  * this makes the workload watcher more similar to other watchers and follow a common pattern
  * it also fixes some minor correctness issues:
    * pods that did not have the ready status condition were being considered when they should not have been
    * updates to ExternalWorkload labels were not being considered
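
The core idea, storing the last known opaqueness and only notifying on change, can be sketched as follows. The type and method are hypothetical and much simpler than the real workload watcher:

```go
package main

import "fmt"

// workloadPublisher is a simplified stand-in that stores the last known
// opaqueness and counts the updates sent to listeners.
type workloadPublisher struct {
	opaque  bool
	updates int
}

// setOpaque records the new opaqueness and reports whether an update was
// sent; unchanged values are dropped without notifying anyone.
func (wp *workloadPublisher) setOpaque(opaque bool) bool {
	if wp.opaque == opaque {
		return false
	}
	wp.opaque = opaque
	wp.updates++
	return true
}

func main() {
	wp := &workloadPublisher{}
	// Simulate an informer resync that replays the same Server state.
	for i := 0; i < 100; i++ {
		wp.setOpaque(true)
	}
	fmt.Println(wp.updates)
}
```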

Signed-off-by: Alex Leong <alex@buoyant.io>
2024-03-19 10:24:02 -07:00