When the proxy's TCP server encounters an error (usually because one of
the connections failed), we log the error and the client's address. The
server's address is omitted because it varies based on context that is
not known in this module: in some cases it's the actual server address
on the socket, but when proxying a connection it may be determined by
the value retrieved from the SO_ORIGINAL_DST socket option.
To fix this, the server now requires that connection metadata be able to
materialize an 'AddrPair' parameter that describes a client-server
connection. The TCP listener impls are updated to satisfy this based on
the appropriate metadata; and the TCP server consumes this type to
include both client and server addresses in the relevant logs/contexts.
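Roughly, the new parameter looks something like the sketch below. The field
names and the logging span are illustrative, not the proxy's actual
definitions:
```rust
use std::net::SocketAddr;

/// Illustrative sketch of a client-server address pair materialized from
/// connection metadata; the real type and its trait impls live in the proxy.
#[derive(Copy, Clone, Debug)]
pub struct AddrPair {
    pub client: SocketAddr,
    pub server: SocketAddr,
}

/// The TCP server can then include both addresses in its logging context.
fn server_span(addrs: &AddrPair) -> tracing::Span {
    tracing::info_span!(
        "server",
        client.addr = %addrs.client,
        server.addr = %addrs.server,
    )
}
```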
ecaaf39 changed the proxy's behavior with regard to creating [default
response classifiers][default]: the defaults used to support detecting
gRPC responses (regardless of the request properties).
To fix this, we modify the metrics module that uses response
classifiers to *require* them without inferring defaults. This enforces
the intended usage pattern so that we do not silently and implicitly
fall back to the default behavior.
This change also updates the `NewClassify` module that inserts the
response classifier request extension so that overrides are supported.
We then can install a default classifier early in request processing and
override it only if specified by a route configuration.
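A minimal sketch of that override pattern, using a placeholder classifier
type stored in `http` request extensions (the real classifier type and the
layer that installs it belong to the proxy):
```rust
use http::Request;

/// Placeholder for the proxy's response-classifier type.
#[derive(Clone, Debug)]
struct Classify(&'static str);

/// Install a default classifier early in request processing, but only if one
/// has not already been set.
fn insert_default<B>(req: &mut Request<B>) {
    if req.extensions().get::<Classify>().is_none() {
        req.extensions_mut().insert(Classify("default"));
    }
}

/// Later, a route configuration may override the installed classifier;
/// `Extensions::insert` returns the previously-installed value, if any.
fn override_classifier<B>(req: &mut Request<B>, classify: Classify) -> Option<Classify> {
    req.extensions_mut().insert(classify)
}
```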
To support this change, the http-metrics crate is updated to support
querying response_total metrics without stringifying everything.
[default]: ecaaf39b46 (diff-372e8a8a57b1fad5d94f37d2f77fdc7a45bcf708782475424b75d671f99ea1a0L97-L103)
The controller client includes a recovery/backoff module that causes
resolutions to be retried when an unexpected error is encountered.
These events are only logged at the debug and trace log levels.
This change updates the destination and policy controller recovery
modules to log unexpected errors as warnings.
The gate middleware controls a service's readiness so that it can exert
back-pressure. This is used, for instance, by the circuit breaker module
so that an endpoint can go into an unavailable state after the breaker
has been tripped and be marked available again as it recovers.
This change fixes a bug in that recovery scenario: when the gate is in a
Limited state (i.e. when the circuit breaker puts an endpoint into
Probation to test its availability), and a caller (i.e. the balancer) is
waiting for the endpoint to leave probation, the balancer may never be
notified that the endpoint has left its probation state.
To fix this, we update the gate controller to definitively close its
inner Semaphore when transitioning out of a limited state -- dropping
the sender's reference to the semaphore does not close it while a
receiver still holds one.
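In terms of `tokio::sync::Semaphore`, the fix amounts to something like the
following (the function name is illustrative):
```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

/// Sketch: on leaving the Limited state, explicitly close the semaphore so
/// that any receiver still holding a clone of the `Arc` observes the
/// transition; dropping only the sender's clone would leave it open.
fn exit_limited(semaphore: &Arc<Semaphore>) {
    semaphore.close();
}
```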
This issue is somewhat masked by the balancer's polling behavior, where
endpoint states are only advanced as requests are processed. It seems
likely, however, that this scenario could be encountered in the wild
when circuit breaking is enabled on a service.
If `Gate` becomes ready, it assumes the inner service remains ready
indefinitely.
Load balancers rely on lazy and redundant readiness checking to avoid
disconnected endpoints.
This change fixes the Gate to ensure that the inner service is always
polled whenever the gate is polled.
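In tower terms, the fix amounts to always delegating readiness to the inner
service. The sketch below is simplified and omits the real gate's state
machine and wakers:
```rust
use std::task::{Context, Poll};
use tower::Service;

/// Simplified sketch: the gate must not cache the inner service's readiness.
/// Even when the gate is open, readiness is re-checked on every poll;
/// otherwise a disconnected endpoint could be reported as ready indefinitely.
struct Gate<S> {
    open: bool, // stand-in for the real gate state
    inner: S,
}

impl<S, Req> Service<Req> for Gate<S>
where
    S: Service<Req>,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = S::Future;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        if !self.open {
            // The real gate registers interest in a state change here.
            return Poll::Pending;
        }
        // Always poll the inner service rather than assuming it stays ready.
        self.inner.poll_ready(cx)
    }

    fn call(&mut self, req: Req) -> Self::Future {
        self.inner.call(req)
    }
}
```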
* chore: change `rust-toolchain` file to toml format
The `rust-toolchain` file containing only a Rust version number is
deprecated in favor of a TOML-formatted `rust-toolchain.toml`. Using the
old format seems to make Dependabot unhappy --- it complains that:
```
only rust-toolchain files formatted as TOML are supported, the non-TOML
format was deprecated by Rust
```
Therefore, this branch changes the toolchain file in this repo to the
TOML format. This required updating the CI workflows that check that
the toolchain matches to use a new regex.
328826caa updated the balancer's discovery channel to prevent backing up
into the discovery stream: when the channel fills, the discovery stream
is dropped. This results in balancers becoming permanently stale (should
they ever be used again).
This change modifies the discovery stream so that these errors are
instead fatal for the balancer; they are recorded distinctly by the
error counters.
To fix this, we replace the `DiscoverNew` module with a
`discover::NewServices` module that wraps the buffering layer. The
buffer now only holds target metadata, and services are only built as
entries are dequeued from the channel.
This has the (positive) side-effect that the proxy's stack_create_total
metric will not be incremented before the balancer actually uses an
endpoint stack. Previously, this metric would be incremented for all
queued endpoint updates.
We also now log at INFO the address of all additions and removals from a
balancer. This should dramatically improve diagnostics in stale endpoint
situations.
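A simplified sketch of that flow, with stand-in types (the `Update` enum and
the `new_service` closure) in place of the proxy's real discovery and
stack-building machinery:
```rust
use tokio::sync::mpsc;

/// Stand-ins for the proxy's real discovery types.
type Target = std::net::SocketAddr;

enum Update {
    Add(Target),
    Remove(Target),
}

/// Sketch: the channel carries only target metadata; an endpoint's service is
/// constructed only when the balancer dequeues the update (which is also when
/// stack_create_total is incremented), not when the discovery stream enqueues
/// it. Additions and removals are logged at INFO.
async fn drain_updates<N, S>(mut rx: mpsc::Receiver<Update>, mut new_service: N)
where
    N: FnMut(Target) -> S,
{
    while let Some(update) = rx.recv().await {
        match update {
            Update::Add(addr) => {
                tracing::info!(%addr, "Adding endpoint to the balancer");
                let _svc = new_service(addr); // built only at dequeue time
            }
            Update::Remove(addr) => {
                tracing::info!(%addr, "Removing endpoint from the balancer");
            }
        }
    }
}
```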
This commit updates the proxy's dependency on `rustix` in order to
resolve a potential memory exhaustion issue when using the
`rustix::fs::Dir` iterator with the `linux-raw` backend. This issue is
described in GHSA-c827-hfw6-qwvm.
We currently depend on both `rustix` v0.36 and v0.37 as transitive deps,
so this branch updates the v0.36 dep from v0.36.14 to v0.36.16, and the
v0.37 dependency from v0.37.4 to v0.37.7.
Unfortunately, we weren't able to get Dependabot to bump these deps for
us, because it no longer supports the legacy (non-TOML) `rust-toolchain`
file (see #2487 for details). Therefore, we have to do this bump
manually.
In 6d2abbc, we changed how outbound proxies process discovery updates.
The prior implementation used a watchdog timeout to bound the amount of
time an update stream could be full. With that change, when an update
channel fills, the backpressure can extend to the destination
controller's gRPC response stream.
To detect and avoid this harmful (and useless) backpressure, this change
modifies the balancer's discovery processing stream to exit when the
balancer has 1000 unprocessed discovery updates. A sufficiently scary
warning is logged.
Fixes https://github.com/linkerd/linkerd2/issues/11449
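Schematically, the new behavior looks like the sketch below; the constant
name and error message are illustrative, and only the 1000-update limit comes
from the change above:
```rust
/// Illustrative constant; the 1000-update limit is described above.
const MAX_PENDING_UPDATES: usize = 1000;

/// Sketch: rather than exerting backpressure into the controller's gRPC
/// response stream, the discovery task gives up when the balancer has left
/// too many updates unprocessed, logging a warning and failing the balancer.
fn check_backlog(pending: usize) -> Result<(), &'static str> {
    if pending > MAX_PENDING_UPDATES {
        tracing::warn!(
            pending,
            "The balancer is not processing discovery updates; aborting discovery"
        );
        return Err("too many unprocessed discovery updates");
    }
    Ok(())
}
```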
The `grpc_status` metric label is rendered as a long-form, human-readable string value in the proxy metrics. For example:
```
response_total{direction="outbound", [...], classification="failure",grpc_status="Unknown error",error=""} 1
```
This is due to the `Display` impl for `Code`. We now explicitly convert the code to an `i32` so that the label renders as a number instead:
```
response_total{direction="outbound", [...] ,classification="failure",grpc_status="2",error=""} 1
```
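The conversion itself is small; a sketch using `tonic::Code` (assuming that
is the `Code` in question here):
```rust
use tonic::Code;

/// Render the numeric gRPC status code for the metric label (e.g. 2 for
/// `Code::Unknown`) rather than its human-readable Display form.
fn grpc_status_label(code: Code) -> String {
    (code as i32).to_string()
}
```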
Signed-off-by: Alex Leong <alex@buoyant.io>
Currently, if errors occur while parsing a client identity from a TLS
certificate, the `client_identity` function in `linkerd-meshtls-rustls`
will simply discard the error and return `None`. This means that we
cannot easily determine *why* a connection has no client identity ---
there may have been no client cert, but we may also have failed to parse
a client cert that was present.
In order to make debugging these issues a little easier, I've changed
this function to log any errors returned by `rustls-webpki` while
parsing client certs.
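The shape of the change is roughly the following; the `parse` closure stands
in for the real webpki-based parsing, and the log level shown is illustrative:
```rust
/// Sketch of the change: rather than discarding a certificate-parsing error
/// and returning `None`, log why extracting the client identity failed.
fn client_identity<T, E: std::fmt::Display>(
    cert_der: &[u8],
    parse: impl Fn(&[u8]) -> Result<T, E>,
) -> Option<T> {
    match parse(cert_der) {
        Ok(id) => Some(id),
        Err(error) => {
            tracing::warn!(%error, "Failed to extract a client identity from the peer certificate");
            None
        }
    }
}
```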
Currently, the proxy [depends on an outdated version of `rustls`][1],
v0.20.8. The `rustls` dependency is via our dependency on `tokio-rustls`
v0.23.4; we don't have a direct `rustls` dependency, in order to ensure
that the version of `rustls` is always the same version as used by
`tokio-rustls`. `rustls` also has a dependency on `webpki`, and v0.20.x
of `rustls` uses the original `webpki` crate, rather than the
`rustls-webpki` crate. So, unfortunately, because we have a transitive
dep on `webpki` via `rustls`, PR linkerd/linkerd2-proxy#2465 did not
remove _all_ `webpki` deps from our dependency tree, only the direct
dependency.
This branch updates to `rustls` v0.21.x, which depends on
`rustls-webpki` rather than `webpki`, removing the `webpki` dependency.
This is accomplished by updating `tokio-rustls` to v0.24.x, implicitly
updating the transitive `rustls` dep. In order to update to the
semver-incompatible version of `rustls`, it was necessary to modify our
code in order to track some breaking API changes. I've also added a
`cargo-deny` ban for `webpki` to our `deny.toml`, to ensure that we
always use the actively-maintained `rustls-webpki` crate rather than
`webpki` classic.
Since peer certificate validation is performed through `rustls` rather
than through the direct `rustls-webpki` dependency, this should
hopefully resolve issues with issuer certs that contain name constraints
--- these were not fixed by linkerd/linkerd2-proxy#2465, because the
failure with certs containing name constraints occurred inside of the
*`webpki` version depended on by `rustls`*, rather than inside of the
proxy's direct dep. See [this comment][2] for details.
In addition, it was necessary to update `rustls-webpki` to v0.101.6,
since v0.101.5 was yanked due to an accidental API breaking change.
<details>
<summary>Verifying that we no longer depend on `webpki`:</summary>
Before:
```console
$ cargo tree -p webpki -i
webpki v0.22.1
├── rustls v0.20.8
│ └── tokio-rustls v0.23.4
│ ├── linkerd-app-integration v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/integration)
│ └── linkerd-meshtls-rustls v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/meshtls/rustls)
│ ├── linkerd-app-inbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/inbound)
│ │ ├── linkerd-app v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app)
│ │ │ ├── linkerd-app-integration v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/integration)
│ │ │ └── linkerd2-proxy v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd2-proxy)
│ │ ├── linkerd-app-admin v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/admin)
│ │ │ └── linkerd-app v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app) (*)
│ │ │ [dev-dependencies]
│ │ │ └── linkerd-app-integration v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/integration)
│ │ └── linkerd-app-gateway v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/gateway)
│ │ └── linkerd-app v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app) (*)
│ │ [dev-dependencies]
│ │ └── linkerd-app-gateway v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/gateway) (*)
│ ├── linkerd-app-outbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/outbound)
│ │ ├── linkerd-app v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app) (*)
│ │ └── linkerd-app-gateway v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/gateway) (*)
│ │ [dev-dependencies]
│ │ └── linkerd-app-gateway v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/gateway) (*)
│ └── linkerd-meshtls v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/meshtls)
│ ├── linkerd-app-core v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/core)
│ │ ├── linkerd-app v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app) (*)
│ │ ├── linkerd-app-admin v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/admin) (*)
│ │ ├── linkerd-app-gateway v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/gateway) (*)
│ │ ├── linkerd-app-inbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/inbound) (*)
│ │ ├── linkerd-app-integration v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/integration)
│ │ ├── linkerd-app-outbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/outbound) (*)
│ │ └── linkerd-app-test v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/test)
│ │ ├── linkerd-app-inbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/inbound) (*)
│ │ ├── linkerd-app-integration v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/integration)
│ │ └── linkerd-app-outbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/outbound) (*)
│ │ [dev-dependencies]
│ │ ├── linkerd-app-gateway v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/gateway) (*)
│ │ ├── linkerd-app-inbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/inbound) (*)
│ │ └── linkerd-app-outbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/outbound) (*)
│ ├── linkerd-app-inbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/inbound) (*)
│ ├── linkerd-proxy-tap v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/proxy/tap)
│ │ └── linkerd-app-core v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/core) (*)
│ └── linkerd2-proxy v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd2-proxy)
│ [dev-dependencies]
│ ├── linkerd-app-inbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/inbound) (*)
│ ├── linkerd-app-integration v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/integration)
│ └── linkerd-app-outbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/outbound) (*)
│ [dev-dependencies]
│ ├── linkerd-app-inbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/inbound) (*)
│ └── linkerd-app-outbound v0.1.0 (/home/eliza/Code/linkerd2-proxy/linkerd/app/outbound) (*)
└── tokio-rustls v0.23.4 (*)
```
After:
```console
$ cargo tree -p webpki -i
error: package ID specification `webpki` did not match any packages
```
</details>
[1]:
8afc72258b/Cargo.lock (L2450-L2460C2)
[2]:
https://github.com/linkerd/linkerd2/issues/9299#issuecomment-1730094953
Now that [v0.101.5 of `rustls-webpki`][1] has been [published][2], we
can depend on the crate from crates.io. This allows us to remove the
Git dependency on the branch preparing that release to be published,
which allows us to remove the allowance for Git dependencies in the
`cargo-deny` config.
[1]: https://github.com/rustls/webpki/releases/tag/v%2F0.101.5
[2]: https://crates.io/crates/rustls-webpki/0.101.5
This commit changes the `linkerd-meshtls-rustls` crate to use the
upstream `rustls-webpki` crate, maintained by Rustls, rather than our
fork of `briansmith/webpki` from GitHub. Since `rustls-webpki` includes
the change which was the initial motivation for the `linkerd/webpki`
fork (rustls/webpki#42), we can now depend on upstream.
Currently, we must take a Git dependency on `rustls-webpki`, since a
release including a fix for an issue (rustls/webpki#167) which prevents
`rustls-webpki` from parsing our test certificates has not yet been
published. Once v0.101.5 of `rustls-webpki` is published (see
rustls/webpki#170), we can remove the Git dep. For now, I've updated
`cargo-deny` to allow the Git dependency.
The `linkerd-meshtls-boring` crate currently uses a Git dependency on
`boring` and `tokio-boring`. This is because, when this crate was
initially introduced, the proxy required unreleased changes to these
crates. Now, however, upstream has published all the changes we depended
on (this happened ages ago), and we can depend on these libraries from
crates.io.
This branch removes the Git deps and updates to v3.0.0 of
`boring`/`tokio-boring`. I've also changed the `cargo-deny` settings to
no longer allow Git deps on these crates, as we no longer depend on them
from Git.
In 2.13, the default inbound and outbound HTTP request queue capacity
decreased from 10,000 requests to 100 requests (in PR #2078). This
change results in proxies shedding load much more aggressively while
under high load to a single destination service, resulting in increased
error rates in comparison to 2.12 (see linkerd/linkerd2#11055 for
details).
This commit changes the default HTTP request queue capacities for the
inbound and outbound proxies back to 10,000 requests, the way they were
in 2.12 and earlier. In manual load testing I've verified that
increasing the queue capacity results in a substantial decrease in 503
Service Unavailable errors emitted by the proxy: with a queue capacity
of 100 requests, the load test described [here] observed a failure rate
of 51.51% of requests, while with a queue capacity of 10,000 requests,
the same load test observes no failures.
Note that I did not modify the TCP connection queue capacities, or the
control plane request queue capacity. These were previously configured
by the same variable before #2078, but were split out into separate vars
in that change. I don't think the queue capacity limits for TCP
connection establishment or for control plane requests are currently
resulting in instability the way the decreased request queue capacity
is, so I decided to make a more focused change to just the HTTP request
queues for the proxies.
[here]: https://github.com/linkerd/linkerd2/issues/11055#issuecomment-1650957357
The proxy currently emits very little useful version information.
This change updates the proxy to support new build-time environment
variables that are used to report version information:
* LINKERD2_PROXY_BUILD_TIME
* LINKERD2_PROXY_VENDOR
* LINKERD2_PROXY_VERSION
Additionally, several pre-existing Git-oriented metadata fields have been
removed, as they were generally redundant or uninformative. The Rustc
version has also been removed (since it has no real user-facing value
and can be easily determined by the version/tag).
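For reference, build-time environment variables of this kind are typically
captured with `option_env!`; a sketch under that assumption (the fallback
strings are placeholders, not the proxy's actual defaults):
```rust
/// Build-time metadata captured at compile time; falls back to placeholder
/// values when the variables are not set by the build.
pub const BUILD_TIME: &str = match option_env!("LINKERD2_PROXY_BUILD_TIME") {
    Some(v) => v,
    None => "unknown",
};
pub const VENDOR: &str = match option_env!("LINKERD2_PROXY_VENDOR") {
    Some(v) => v,
    None => "unknown",
};
pub const VERSION: &str = match option_env!("LINKERD2_PROXY_VERSION") {
    Some(v) => v,
    None => "unknown",
};
```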
When the outbound proxy resolves an outbound policy from the policy
controller's `OutboundPolicies` API, the policy controller may return an
error with the `grpc-status` code `NotFound` in order to indicate that
the destination is not a ClusterIP service. When this occurs, the proxy
will fall back to either using a ServiceProfile, if the ServiceProfile
contains non-trivial configuration, or synthesizing a default client
policy from the ServiceProfile.
However, when the outbound proxy is configured to run in ingress mode,
the fallback behavior does not occur. Instead, the ingress mode proxy
treats any error returned by the policy controller's `OutboundPolicies`
API as fatal. This means that when an ingress controller performs its
own load-balancing and opens a connection to a pod IP directly, the
ingress mode proxy will fail any requests on that connection. This is a
bug, and is the cause of the issues described in linkerd/linkerd2#10908.
This branch fixes this by changing the ingress mode proxy to handle
`NotFound` errors returned by the policy controller. I've added similar
logic for synthesizing default policies from a discovered
ServiceProfile, or using the profile if it's non-trivial. Unfortunately,
we can't just reuse the existing `Outbound::resolver` method, as ingress
discovery may be performed for an original destination address *or* for
a DNS name, and it's necessary to construct fallback policies in either
case. Instead, I've added a new function with similar behavior that's
ingress-specific.
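The core of the new handling is roughly the following; types other than
`tonic::Status`/`Code` are illustrative stand-ins:
```rust
use tonic::{Code, Status};

/// Sketch: in ingress mode, a NotFound from the `OutboundPolicies` API now
/// selects ServiceProfile-based fallback behavior instead of failing the
/// request.
enum Discovery<P> {
    Policy(P),
    ProfileFallback,
}

fn handle_policy_result<P>(result: Result<P, Status>) -> Result<Discovery<P>, Status> {
    match result {
        Ok(policy) => Ok(Discovery::Policy(policy)),
        // The destination is not a ClusterIP service; fall back to the
        // ServiceProfile (or a default policy synthesized from it).
        Err(status) if status.code() == Code::NotFound => Ok(Discovery::ProfileFallback),
        // All other errors remain fatal.
        Err(status) => Err(status),
    }
}
```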
I've manually tested this change against the repro steps[^1] described
in linkerd/linkerd2#10908, and verified that the proxy 503s on 2.13.4,
and that it once again routes correctly after applying this change.
Fixes linkerd/linkerd2#10908.
[^1]: As described in the first comment, using Contour and podinfo.
PRs #2418 and #2419 add per-route and per-backend request timeouts
configured by the `OutboundPolicies` API to the `MatchedRoute` and
`MatchedBackend` layers in the outbound `ClientPolicy` stack,
respectively. This means that — unlike in the `ServiceProfile` stack —
two separate request timeouts can be configured in `ClientPolicy`
stacks. However, because both the `MatchedRoute` and `MatchedBackend`
layers are in the HTTP logical stack, the errors emitted by both
timeouts will have a `LogicalError` as their most specific error
metadata, meaning that the log messages and `l5d-proxy-error` headers
recorded for these timeouts do not indicate whether the timeout that
failed the request was the route request timeout or the backend request
timeout.
In order to ensure this information is recorded and exposed to the user,
this branch adds two new error wrapper types, one of which enriches an
error with a `RouteRef`'s metadata, and one of which enriches an error
with a `BackendRef`'s metadata. The `MatchedRoute` stack now wraps all
errors with `RouteRef` metadata, and the `MatchedBackend` stack wraps
errors with `BackendRef` metadata. This way, when the route timeout
fails a request, the error will include the route metadata, while when
the backend request timeout fails a request, the error will include both
the route and backend metadata.
Adding these new error wrappers also has the additional side benefit of
adding this metadata to errors returned by filters, allowing users to
distinguish between errors emitted by a filter on a route rule and
errors emitted by a per-backend filter. Also, any other errors emitted
lower in the stack for requests that are handled by a client policy
stack will now also include this metadata, which seems generally useful.
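A simplified sketch of the wrapper shape for the route case; the real types
carry richer metadata, use the proxy's error conventions, and the backend
wrapper is analogous:
```rust
use std::fmt;

type BoxError = Box<dyn std::error::Error + 'static>;

/// Illustrative stand-in for the route metadata attached to errors.
#[derive(Clone, Debug)]
pub struct RouteRef(pub String);

/// Wraps an inner error with the route's metadata so that timeouts and filter
/// errors surface which route they came from.
#[derive(Debug)]
pub struct RouteError {
    route: RouteRef,
    source: BoxError,
}

impl RouteError {
    pub fn new(route: RouteRef, source: BoxError) -> Self {
        Self { route, source }
    }
}

impl fmt::Display for RouteError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "route {}: {}", self.route.0, self.source)
    }
}

impl std::error::Error for RouteError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        Some(&*self.source)
    }
}
```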
Example errors, taken from a proxy unit test:
backend request:
```
logical service logical.test.svc.cluster.local:666: route httproute.test.timeout-route: backend service.test.test-svc:666: HTTP response timeout after 1s
```
route request:
```
logical service logical.test.svc.cluster.local:666: route httproute.test.timeout-route: HTTP response timeout after 2s
```
Depends on #2418
The latest proxy-api release, v0.10.0, adds fields to the
`OutboundPolicies` API for configuring HTTP request timeouts, based on
the proposed changes to HTTPRoute in kubernetes-sigs/gateway-api#1997.
PR #2418 updates the proxy to depend on the new proxy-api release, and
implements the `Rule.request_timeout` field added to the API. However,
that branch does *not* add a timeout for the
`RouteBackend.request_timeout` field. This branch changes the proxy to
apply the backend request timeout when configured by the policy
controller.
This branch implements `RouteBackend.request_timeout` by adding an
additional timeout layer in the `MatchedBackend` stack. This applies the
per-backend timeout once a backend is selected for a route. I've also
added stack tests for the interaction between the request and backend
request timeouts.
Note that once retries are added to client policy stacks, it may be
necessary to move the backend request timeout to ensure it occurs
"below" retries, depending on where the retry middleware ends up being
located in the proxy stack.
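In tower terms, the per-backend timeout is a middleware applied around the
selected backend's service; a sketch (the proxy's actual middleware and error
types differ):
```rust
use std::time::Duration;
use tower::timeout::Timeout;

/// Sketch: the backend request timeout wraps the per-backend service, below
/// the route-level timeout, so each timeout fails requests with its own
/// error. The duration comes from `RouteBackend.request_timeout` when set.
fn backend_with_timeout<S>(backend: S, timeout: Duration) -> Timeout<S> {
    Timeout::new(backend, timeout)
}
```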
The latest proxy-api release, v0.10.0, adds fields to the
`OutboundPolicies` API for configuring HTTP request timeouts, based on
the proposed changes to HTTPRoute in kubernetes-sigs/gateway-api#1997.
This branch updates the proxy-api dependency to v0.10.0 and adds the new
timeout configuration fields to the proxy's internal client policy
types. In addition, this branch adds a timeout middleware to the HTTP
client policy stack, so that the timeout described by the
`Rule.request_timeout` field is now applied.
Implementing the `RouteBackend.request_timeout` field with semantics as
close as possible to those described in GEP-1742 will be somewhat more
complex, and will be added in a separate PR.
The gRPC protocol always sets the HTTP response status code to 200 and instead communicates failures in a grpc-status header sent in a TRAILERS frame. Linkerd uses the HTTP response status code to determine if a response is successful, and therefore will consider all gRPC responses successful regardless of their gRPC status code. This means that functionality such as retries and circuit breaking do not function correctly with gRPC traffic.
We update the HTTP classifier to look for the presence of a `Content-Type: application/grpc` request header and use gRPC response classification when it is set.
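The detection itself is a simple header check; a sketch using the `http`
crate (the classifier selection around it is the proxy's own):
```rust
use http::{header::CONTENT_TYPE, Request};

/// Sketch: select gRPC response classification when the request advertises a
/// gRPC content type; otherwise keep plain HTTP status-code classification.
fn is_grpc<B>(req: &Request<B>) -> bool {
    req.headers()
        .get(CONTENT_TYPE)
        .and_then(|v| v.to_str().ok())
        .map(|ct| ct.starts_with("application/grpc"))
        .unwrap_or(false)
}
```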
Signed-off-by: Alex Leong <alex@buoyant.io>
In the most recent stable versions, pods cannot communicate with themselves when using a ClusterIP. While direct (pod-to-pod) connections are never sent through the proxy and are skipped at the iptables level, connections to a logical service still pass through the proxy. When the chosen endpoint is the same as the source of the traffic, TLS and H2 upgrades should be skipped.
Every endpoint receives an h2 upgrade hint in its metadata. When looking into the problem, I noticed that client settings do not take into account that the target may be local. When deciding what client settings to use, we do not upgrade the connection when the hint is "unknown" (gatewayed connections) or "opaque". This change does a similar thing by using H1 settings when the protocol is H1 and the target IP is also part of the inbound IPs passed to the proxy.
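The check itself is a simple membership test; a sketch (the function name and
surrounding client-settings logic are illustrative):
```rust
use std::{collections::HashSet, net::IpAddr};

/// Sketch of the check described above: when the selected endpoint address is
/// one of the proxy's own inbound IPs (i.e. the pod is talking to itself via
/// a ClusterIP), keep plain HTTP/1 client settings rather than upgrading.
fn endpoint_is_local(endpoint_ip: IpAddr, inbound_ips: &HashSet<IpAddr>) -> bool {
    inbound_ips.contains(&endpoint_ip)
}
```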
Fixes linkerd/linkerd2#10816
Signed-off-by: Matei David <matei@buoyant.io>
The W3C context propagation uses the wrong span ID right now. That
causes all spans emitted by linkerd-proxy to be siblings rather than
children of their original parent.
This only applies to W3C as far as I can tell, because the B3
propagation uses the span ID correctly.
Signed-off-by: Willi Schönborn <w.schoenborn@gmail.com>
The proc-macro ecosystem is in the middle of a migration from `syn` v1
to `syn` v2. Some crates (such as `tokio-macros`, `async-trait`,
`tracing-attributes`, etc) have been updated to v2, while others haven't
yet. This means that `cargo deny` will not currently permit us to update
some of those crates to versions that depend on `syn` v2, because they
will create a duplicate dependency.
Since `syn` is used by proc-macros (executed at compile time), duplicate
versions won't have an impact on the final binary size. Therefore, it's
fine to allow both v1 and v2 to coexist while the ecosystem is still
being gradually migrated to the new version.
If the policy controller is from a Linkerd version earlier than 2.13.x,
it will return the `Unimplemented` gRPC status code for requests to the
`OutboundPolicies` API. The proxy's outbound policy client will
currently retry this error code, rather than synthesizing a default
policy. Since 2.13.x proxies require an `OutboundPolicy` to be
discovered before handling outbound traffic, this means that 2.13.x
proxies cannot handle outbound connections when the control plane
is on an earlier version. Therefore, installing Linkerd 2.13 and then
downgrading to 2.12 can potentially break the data plane's ability to
route traffic.
In order to support downgrade scenarios, the proxy should also
synthesize a default policy when receiving an `Unimplemented` gRPC
status code from the policy controller. This branch changes the proxy to
do that. A warning is logged which indicates that the control plane
version is older than the proxy's.
The proxy injector populates an environment variable,
`LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION`, with a list
of all ports marked as opaque. Currently, however, the proxy _does not
actually use this environment variable_. Instead, opaque ports are
discovered from the policy controller. The opaque ports environment
variable was used only when running in the "fixed" inbound policy mode,
where all inbound policies are determined from environment variables,
and no policy controller address is provided. This mode is no longer
supported, and the policy controller address is now required, so the
`LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION` environment
variable is not currently used to discover inbound opaque ports.
There are two issues with the current state of things. One is that
inbound policy discovery is _non-blocking_: when an inbound proxy
receives a connection on a port that it has not previously discovered a
policy for, it uses the default policy until it has successfully
discovered a policy for that port from the policy controller. This means
that the proxy may perform protocol detection on the first connection to
an opaque port. This isn't great, as it may result in a protocol
detection timeout error on a port that the user had previously marked as
opaque. It would be preferable for the proxy to read the environment
variable, and use it to determine whether the default policy for a port
is opaque, so that ports marked as opaque disable protocol detection
even before the "actual" policy is discovered.
The other issue with the
`LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION` environment
variable is that it is currently a list of _individual port numbers_,
while the proxy injector can accept annotations that specify _ranges_ of
opaque ports. This means that when a very large number of ports are
marked as opaque, the proxy manifest must contain a list of each
individual port number in those ranges, making it potentially quite
large. See linkerd/linkerd2#9803 for details on this issue.
This branch addresses both of these problems. The proxy is changed so
that it will once again read the
`LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION` environment
variable, and use it to determine which ports should have opaque
policies by default. The parsing of the environment variable is changed
to support specifying ports as a list of ranges, rather than a list of
individual port numbers. Along with a proxy-injector change, this would
resolve the manifest size issue described in linkerd/linkerd2#9803.
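A sketch of the range-list parsing described above; error handling and
validation (e.g. rejecting inverted ranges) are simplified, and the proxy's
real parser differs:
```rust
use std::ops::RangeInclusive;

/// Accepts entries such as "25,443,8000-9000" and yields inclusive port
/// ranges; a bare port is treated as a single-port range.
fn parse_port_ranges(s: &str) -> Result<Vec<RangeInclusive<u16>>, std::num::ParseIntError> {
    let mut ranges = Vec::new();
    for entry in s.split(',').map(str::trim).filter(|e| !e.is_empty()) {
        let range = match entry.split_once('-') {
            Some((lo, hi)) => lo.trim().parse()?..=hi.trim().parse()?,
            None => {
                let port: u16 = entry.parse()?;
                port..=port
            }
        };
        ranges.push(range);
    }
    Ok(ranges)
}
```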
This is implemented by changing the `inbound::policy::Store` type to
also include a set of port ranges that are marked as opaque. When the
`Store` handles a `get_policy` call for a port that is not already in
the cache, it starts a control plane watch for that port just as it did
previously. However, when determining the initial _default_ value for
the policy, before the control plane discovery provides one, it checks
whether the port is in a range that is marked as opaque, and, if it is,
uses an opaque default policy instead.
This approach was chosen rather than pre-populating the `Store` with
policies for all opaque ports to better handle the case where very large
ranges are marked as opaque and are used infrequently. If the `Store`
was pre-populated with default policies for all such ports, it would
essentially behave as though all ports in
`LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION` were also in
`LINKERD2_PROXY_INBOUND_PORTS`, and the proxy would immediately start a
policy controller discovery watch for all opaque ports, which would be
kept open for the proxy's entire lifetime. In cases where the opaque
ports ranges include ~10,000s of ports, this causes significant
unnecessary load on the policy controller. Storing opaque port ranges
separately and using them to determine the default policy as needed
allows opaque port policies to be treated the same as non-default ports,
which are discovered as needed and can be evicted from the cache if they
are unused. If a port is in both
`LINKERD2_PROXY_INBOUND_PORTS_DISABLE_PROTOCOL_DETECTION` *and*
`LINKERD2_PROXY_INBOUND_PORTS`, the proxy will start discovery eagerly
and retain the port in the cache forever, but the default policy will be
opaque.
I've also added a test for the behavior of opaque ports where the port's
policy has not been discovered from the policy controller. That test
fails on `main`, as the proxy attempts protocol detection, but passes on
this branch.
In addition, I changed the parsing of the `LINKERD2_PROXY_INBOUND_PORTS`
environment variable to also accept ranges, because it seemed like a
nice thing to do while I was here. :)
Looks like we accidentally merged PR #2375 without a CI build against
the latest state of `main`. In the meantime since #2375 was last built
on CI, PR #2374 added an additional metadata field to
`policy::HttpParams`, which the `HttpParams` constructed in the test
added from #2375 doesn't populate. Therefore, merging this PR broke the
build. Whoops!
This commit populates the `meta` field, fixing it.
This branch adds a new test for failure accrual in load balancers with
multiple endpoints. This test asserts that endpoints whose circuit
breakers have tripped will not be selected by a load balancer.
Since upstream has yet to release a version with PR
bluejekyll/trust-dns#1881, this commit changes the proxy's default log
level to silence warnings from `trust_dns_proto` that are generally
spurious.
See linkerd/linkerd2#10123 for details.
Currently, the outbound proxy determines whether or not to perform
protocol detection based on the presence of the `opaque_protocol` field
on the resolved `ServiceProfile` from the Destination controller.
However, the `OutboundPolicy` resolved from the policy controller also
contains a `proxy_protocol` field that indicates what protocol should be
used for this destination. While the proxy uses the HTTPRoutes from the
`OutboundPolicy`'s `proxy_protocol`, it does _not_ take into account the
`proxy_protocol` when determining whether or not to perform protocol
detection. This can result in the outbound proxy performing protocol
detection on connections to destinations that have been marked as
opaque.
This branch modifies the outbound proxy to use the `proxy_protocol` from
the `OutboundPolicy`, as well as the `opaque_protocol` field from the
`ServiceProfile`, when determining whether or not to perform protocol
detection. In addition, I've added an integration test, which fails before
making the changes on this branch.
Fixes linkerd/linkerd2#10745
The DOS mitigation changes in `h2` v0.3.17 inadvertently introduced a
potential panic (hyperium/h2#674). Version 0.3.18 fixes this, so we
should bump the proxy's dependency to avoid panics.
Currently, when the outbound proxy makes a direct connection prefixed
with a `TransportHeader` in order to send HTTP traffic, it will always
send a `SessionProtocol` hint with the HTTP version as part of the
header. This instructs the inbound proxy to use that protocol, even if
the target port has a ServerPolicy that marks that port as opaque, which
can result in incorrect handling of that connection. See
linkerd/linkerd2#9888 for details.
In order to prevent this, linkerd/linkerd2-proxy-api#197 adds a new
`ProtocolHint` value to the protobuf endpoint metadata message. This
will allow the Destination controller to explicitly indicate to the
outbound proxy that a given endpoint is known to handle all connections
to a port as an opaque TCP stream, and that the proxy should not perform
a protocol upgrade or send a `SessionProtocol` in the transport header.
This branch updates the proxy to handle this new hint value, and adds
tests that the outbound proxy behaves as expected.
Along with linkerd/linkerd2#10301, this will fix linkerd/linkerd2#9888.
I opened a new PR for this change rather than attempting to rebase my
previous PR #2209, as it felt a bit easier to start with a new branch
and just make the changes that were still relevant. Therefore, this
closes #2209.