* chore(app/outbound): `linkerd-mock-http-body` test dependency
this adds a development dependency, so we can use this mock body type in
the outbound proxy's unit tests.
Signed-off-by: katelyn martin <kate@buoyant.io>
* chore(app/outbound): additional http route metrics tests
Signed-off-by: katelyn martin <kate@buoyant.io>
* chore(app/outbound): additional grpc route metrics tests
Signed-off-by: katelyn martin <kate@buoyant.io>
* fix(http/prom): record bodies when eos reached
this commit fixes a bug discovered by @alpeb, which was introduced in
proxy v2.288.0.
> The associated metric is `outbound_http_route_request_statuses_total`:
>
> ```
> $ linkerd dg proxy-metrics -n booksapp deploy/webapp|rg outbound_http_route_request_statuses_total.*authors
> outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="204",error=""} 5
> outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="201",error="UNKNOWN"} 5
> outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="200",error="UNKNOWN"} 10
> ```
>
> The problem was introduced in `edge-25.3.4`, with the proxy `v2.288.0`.
> Before that the metrics looked like:
>
> ```
> $ linkerd dg proxy-metrics -n booksapp deploy/webapp|rg outbound_http_route_request_statuses_total.*authors
> outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="200",error=""} 193
> outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="204",error=""} 96
> outbound_http_route_request_statuses_total{parent_group="core",parent_kind="Service",parent_namespace="booksapp",parent_name="authors",parent_port="7001",parent_section_name="",route_group="",route_kind="default",route_namespace="",route_name="http",hostname="",http_status="201",error=""} 96
> ```
>
> So the difference is the non-empty value for `error=UNKNOWN` even
> when `http_status` is 2xx, which `linkerd viz stat-outbound`
> interprets as failed requests.
in #3086 we introduced a suite of route- and backend-level metrics. that
subsystem contains a body middleware that will report itself as having
reached the end-of-stream by delegating directly down to its inner
body's `is_end_stream()` hint.
this is roughly correct, but is slightly distinct from the actual
invariant: a `linkerd_http_prom::record_response::ResponseBody<B>` must
call its `end_stream` helper to classify the outcome and increment the
corresponding time series in the
`outbound_http_route_request_statuses_total` metric family.
in #3504 we upgraded our hyper dependency. while doing so, we neglected
to include a call to `end_stream` if a data frame is yielded and the
inner body reports itself as having reached the end-of-stream.
this meant that instrumented bodies would be polled until the end of the
stream was reached, but could be dropped before a `None` was encountered.
this commit fixes this issue in two ways, to be defensive:
* invoke `end_stream()` if a non-trailers frame is yielded, and the
inner body now reports itself as having ended. this restores the
behavior in place prior to #3504. see the relevant component of that
diff, here:
<https://github.com/linkerd/linkerd2-proxy/pull/3504/files#diff-45d0bc344f76c111551a8eaf5d3f0e0c22ee6e6836a626e46402a6ae3cbc0035L262-R274>
* rather than delegating to the inner `<B as Body>::is_end_stream()`
method, report the end-of-stream being reached by inspecting whether
or not the inner response state has been taken. this is the state that
directly indicates whether or not the `ResponseBody<B>` middleware is
finished.
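as a rough sketch (simplified and hypothetical, not the actual
`linkerd-http-prom` code), the intended polling behavior looks something
like this:

```rust
use http_body::{Body, Frame};
use std::{
    pin::Pin,
    task::{Context, Poll},
};

/// a simplified stand-in for the real instrumented response body.
struct ResponseBody<B> {
    inner: B,
    /// `Some(..)` until the response outcome has been classified and recorded.
    state: Option<State>,
}

struct State {/* labels, counter handles, .. */}

impl<B> ResponseBody<B> {
    /// classify the outcome and increment the corresponding time series.
    fn end_stream(&mut self) {
        if let Some(_state) = self.state.take() {
            // increment `outbound_http_route_request_statuses_total` here.
        }
    }
}

impl<B: Body + Unpin> Body for ResponseBody<B> {
    type Data = B::Data;
    type Error = B::Error;

    fn poll_frame(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Option<Result<Frame<Self::Data>, Self::Error>>> {
        let this = self.get_mut();
        match Pin::new(&mut this.inner).poll_frame(cx) {
            // (1) a non-trailers frame was yielded *and* the inner body now
            // reports itself as finished: record the outcome immediately,
            // rather than relying on a final poll that may never happen.
            Poll::Ready(Some(Ok(frame))) => {
                if !frame.is_trailers() && this.inner.is_end_stream() {
                    this.end_stream();
                }
                Poll::Ready(Some(Ok(frame)))
            }
            // the inner body is exhausted: record the outcome.
            Poll::Ready(None) => {
                this.end_stream();
                Poll::Ready(None)
            }
            // (error classification is elided in this sketch.)
            Poll::Ready(Some(Err(error))) => Poll::Ready(Some(Err(error))),
            Poll::Pending => Poll::Pending,
        }
    }

    // (2) report end-of-stream by inspecting whether the response state has
    // been taken, rather than delegating to the inner body's hint.
    fn is_end_stream(&self) -> bool {
        self.state.is_none()
    }
}
```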
X-ref: #3504
X-ref: #3086
X-ref: linkerd/linkerd2#8733
Signed-off-by: katelyn martin <kate@buoyant.io>
---------
Signed-off-by: katelyn martin <kate@buoyant.io>
this commit removes the `linkerd-http-executor` crate, and replaces all
usage of its `TracingExecutor` type with the `TokioExecutor` type
provided by `hyper-util`.
this work is based upon hyperium/hyper-util#166. that change, included
in the 0.1.11 release, altered the `TokioExecutor` type so that it
propagates tracing context when the `tracing` feature is enabled.
with that change made, our `TracingExecutor` type is now redundant.
* https://github.com/hyperium/hyper-util/pull/166
* https://github.com/hyperium/hyper-util/blob/master/CHANGELOG.md#0111-2025-03-31
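as a rough illustration, anywhere we previously handed hyper a
`TracingExecutor`, we can now hand it a `TokioExecutor` instead (a minimal,
hypothetical sketch; the proxy's actual wiring passes the executor to its
hyper client and server builders):

```rust
use hyper::rt::Executor;
use hyper_util::rt::TokioExecutor;

fn main() {
    let rt = tokio::runtime::Runtime::new().unwrap();
    let _guard = rt.enter();

    // with hyper-util >= 0.1.11 and its `tracing` feature enabled, the
    // executor propagates the caller's tracing span into the spawned task,
    // which is what the removed `TracingExecutor` existed to do.
    TokioExecutor::new().execute(async {
        tracing::debug!("this event inherits the spawning task's span");
    });
}
```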
Signed-off-by: katelyn martin <kate@buoyant.io>
this commit introduces a new metric family tracking the rate and outcome
of dns lookups made by the linkerd2 proxy. this metric family has three
labels, counting the number of DNS resolutions for each distinct
control plane client, by record type (A/AAAA or SRV), and by outcome
(success or failure).
this metric is named `control_dns_resolutions_total`.
this commit generally does this via the addition of some new interfaces
to `linkerd-dns`'s `Resolver` structure. the `resolve_addrs()` method is
extended to increment particular counters if they have been installed.
the `linkerd-app` crate's `Dns` type now encapsulates its resolver, and
callers acquire a new resolver by providing a client name to its
`resolver()` method. this uses the client name to construct label sets
and create the corresponding time series for each client.
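for illustration, the per-client label sets follow the usual
`prometheus-client` pattern, sketched below with hypothetical label and
client names (not the proxy's actual types):

```rust
use prometheus_client::{
    encoding::EncodeLabelSet,
    metrics::{counter::Counter, family::Family},
    registry::Registry,
};

/// hypothetical label set mirroring the labels described above.
#[derive(Clone, Debug, Hash, PartialEq, Eq, EncodeLabelSet)]
struct Labels {
    client: String,
    record_type: String,
    result: String,
}

fn main() {
    let mut registry = Registry::default();
    let resolutions = Family::<Labels, Counter>::default();
    // counters are encoded with a `_total` suffix, yielding
    // `control_dns_resolutions_total`.
    registry.register(
        "control_dns_resolutions",
        "DNS resolutions made by control-plane clients",
        resolutions.clone(),
    );

    // a client-scoped resolver handle would bump its own time series on each
    // lookup, e.g. after a successful A/AAAA resolution:
    resolutions
        .get_or_create(&Labels {
            client: "policy".to_string(),
            record_type: "A/AAAA".to_string(),
            result: "success".to_string(),
        })
        .inc();
}
```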
once proxies with this patch are running, and the viz extension has been
installed, one can query this metric like so:
**nb:** this screenshot shows an early prototype; this metric has since
been renamed.

this promQL query...
```
sum(rate(control_dns_resolutions_total[1m])) by (app,client,result) > 0
```
...will show the rate of dns lookups/failures over the last minute across
each application workload, for each control-plane client, and for each
possible outcome.
Signed-off-by: katelyn martin <kate@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
In linkerd/linkerd2-proxy#3547, we removed unsafe authority labels. This was a
breaking change, since the behavior was considered unsafe.
To support a graceful migration, this change adds an environment configuration,
`LINKERD2_PROXY_INBOUND_AUTHORITY_LABELS=unsafe`, that reverts to the prior
behavior.
It may be configured in linkerd2 via the `proxy.additionalEnv` Helm value.
this commit changes a message for a debug-level tracing event.
this block builds a trace collector. we can call it that, instead of the
more generic term "client". there are many clients being built here,
including identity, policy, and destination controller clients.
Signed-off-by: katelyn martin <kate@buoyant.io>
this commit fixes some broken links now that we have updated to the
latest 1.0 version of `http-body`.
this should address some warnings that can be seen in pull requests'
"files" tab in github. see, for example:
`https://github.com/linkerd/linkerd2-proxy/pull/3818/files`.
Signed-off-by: katelyn martin <kate@buoyant.io>
`LINKERD2_PROXY_RESOLV_CONF` is an environment variable that is ostensibly
used to set the path of the resolver configuration file.
this connects to a `resolv_conf_path` field in the application's dns
`Config` structure, but that field is never used.
because it is marked as public, this isn't caught by the compiler's dead
code analysis.
see `resolv.conf(5)` for more information.
Signed-off-by: katelyn martin <kate@buoyant.io>
this commit addresses a todo comment in the `linkerd-proxy-resolve`
crate. this comment mentioned that a `match` block was originally an `if
let` block. a clippy lint regarding `match` statements with a single
pattern is locally ignored as well.
contrary to the comment, `if let` *does* work with pin projection, as of
today.
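for example, something like the following compiles today with
`pin-project` (a generic illustration, not the actual
`linkerd-proxy-resolve` code):

```rust
use pin_project::pin_project;
use std::pin::Pin;

#[pin_project(project = StateProj)]
enum State<F> {
    Pending(#[pin] F),
    Done,
}

fn is_pending<F>(state: Pin<&mut State<F>>) -> bool {
    // an `if let` over the pinned projection works fine; no single-arm
    // `match` (or locally allowed clippy lint) is required.
    if let StateProj::Pending(_fut) = state.project() {
        return true;
    }
    false
}

fn main() {
    let mut done: State<std::future::Ready<()>> = State::Done;
    assert!(!is_pending(Pin::new(&mut done)));

    let mut pending = State::Pending(std::future::ready(()));
    assert!(is_pending(Pin::new(&mut pending)));
}
```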
Signed-off-by: katelyn martin <kate@buoyant.io>
DNS servers may return extremely low TTLs in some cases. When we're polling DNS to power a load balancer, we need to enforce a minimum duration to prevent tight-looping DNS queries.
This change adds a 5s minimum time between DNS lookups when resolving control plane components.
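A minimal sketch of the clamping involved (the constant and function names
here are assumptions, not the proxy's actual code):

```rust
use std::time::Duration;

/// Hypothetical floor matching the 5s minimum described above.
const MIN_REFRESH: Duration = Duration::from_secs(5);

/// Never poll more frequently than the floor, even when the server returns
/// an extremely low (or zero) TTL.
fn refresh_after(ttl: Duration) -> Duration {
    ttl.max(MIN_REFRESH)
}

fn main() {
    assert_eq!(refresh_after(Duration::from_millis(10)), MIN_REFRESH);
    assert_eq!(refresh_after(Duration::from_secs(30)), Duration::from_secs(30));
}
```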
fixes linkerd/linkerd2#13508
* build(deps): bump deranged from 0.4.0 to 0.4.1
Bumps [deranged](https://github.com/jhpratt/deranged) from 0.4.0 to 0.4.1.
- [Commits](https://github.com/jhpratt/deranged/commits)
---
updated-dependencies:
- dependency-name: deranged
dependency-type: indirect
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* fix(proxy/tap): fix inference error
https://github.com/jhpratt/deranged/issues/19
`deranged` added some additional interfaces in 0.4.1 that seem to affect
this `Into<T>` invocation. use `From::from` instead, so we can
explicitly indicate that we wish to convert this into an integer for
comparison.
Signed-off-by: katelyn martin <kate@buoyant.io>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: katelyn martin <kate@buoyant.io>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: katelyn martin <kate@buoyant.io>
* chore(deps)!: upgrade to tower 0.5
this commit updates our tower dependency from 0.4 to 0.5.
note that this commit does not affect the `tower-service` and
`tower-layer` crates, reëxported by `tower` itself. the `Service<T>`
trait and the closely related `Layer<S>` trait have not been changed.
the `tower` crate's utilities have changed in various ways, some of
particular note for the linkerd2 proxy. see these items, excerpted from
the tower changelog:
- **retry**: **Breaking Change** `retry::Policy::retry` now accepts `&mut Req` and `&mut Res` instead of the previous mutable versions. This
increases the flexibility of the retry policy. To update, update your method signature to include `mut` for both parameters. ([tower-rs/tower#584])
- **retry**: **Breaking Change** Change Policy to accept &mut self ([tower-rs/tower#681])
- **retry**: **Breaking Change** `Budget` is now a trait. This allows end-users to implement their own budget and bucket implementations. ([tower-rs/tower#703])
- **util**: **Breaking Change** `Either::A` and `Either::B` have been renamed `Either::Left` and `Either::Right`, respectively. ([tower-rs/tower#637])
- **util**: **Breaking Change** `Either` now requires its two services to have the same error type. ([tower-rs/tower#637])
- **util**: **Breaking Change** `Either` no longer implements `Future`. ([tower-rs/tower#637])
- **buffer**: **Breaking Change** `Buffer<S, Request>` is now generic over `Buffer<Request, S::Future>`. ([tower-rs/tower#654])
see:
* <https://github.com/tower-rs/tower/pull/584>
* <https://github.com/tower-rs/tower/pull/681>
* <https://github.com/tower-rs/tower/pull/703>
* <https://github.com/tower-rs/tower/pull/637>
* <https://github.com/tower-rs/tower/pull/654>
the `Either` trait bounds are particularly impactful for us. because
this runs counter to how we treat errors (skewing towards boxed errors,
in general), we temporarily vendor a version of `Either` from the 0.4
release, whose variants have been renamed to match the 0.5 interface.
updating to box the inner `A` and `B` services' errors, so that we satisfy
the new `A::Error = B::Error` bounds, can be addressed as a follow-on.
that's intentionally left as a separate change, due to the net size of
our patchset between this branch and #3504.
* <https://github.com/tower-rs/tower/compare/v0.4.x...master>
* <https://github.com/tower-rs/tower/blob/master/tower/CHANGELOG.md>
this work is based upon #3504. for more information, see:
* https://github.com/linkerd/linkerd2/issues/8733
* https://github.com/linkerd/linkerd2-proxy/pull/3504
Signed-off-by: katelyn martin <kate@buoyant.io>
X-Ref: https://github.com/tower-rs/tower/pull/815
X-Ref: https://github.com/tower-rs/tower/pull/817
X-Ref: https://github.com/tower-rs/tower/pull/818
X-Ref: https://github.com/tower-rs/tower/pull/819
* fix(stack/loadshed): update test affected by tower-rs/tower#635
this commit updates a test that was affected by breaking changes in
tower's `Buffer` middleware. see this excerpt from the description of
that change:
> I had to change some of the integration tests slightly as part of this
> change. This is because the buffer implementation using semaphore
> permits is _very subtly_ different from one using a bounded channel. In
> the `Semaphore`-based implementation, a semaphore permit is stored in
> the `Message` struct sent over the channel. This is so that the capacity
> is used as long as the message is in flight. However, when the worker
> task is processing a message that's been received from the channel,
> the permit is still not dropped. Essentially, the one message actively
> held by the worker task _also_ occupies one "slot" of capacity, so the
> actual channel capacity is one less than the value passed to the
> constructor, _once the first request has been sent to the worker_. The
> bounded MPSC changed this behavior so that capacity is only occupied
> while a request is actually in the channel, which broke some tests
> that relied on the old (and technically wrong) behavior.
pay particular attention to this:
> The bounded MPSC changed this behavior so that capacity is only
> occupied while a request is actually in the channel, which broke some
> tests that relied on the old (and technically wrong) behavior.
that pr adds an additional message to the channel in tests exercising
the load-shedding behavior, to account for the removal of the previous
(incorrect) behavior.
https://github.com/tower-rs/tower/pull/635/files#r797108274
this commit performs the same change for our corresponding test, adding
an additional `ready()` call before we hit the buffer's limit.
Signed-off-by: katelyn martin <kate@buoyant.io>
* review: use vendored `Either` for consistency
https://github.com/linkerd/linkerd2-proxy/pull/3744#discussion_r1999878537
Signed-off-by: katelyn martin <kate@buoyant.io>
---------
Signed-off-by: katelyn martin <kate@buoyant.io>
In #3626, we refactored the origin_dst determination logic to utilize
socket2 calls. However, this change inadvertently disrupted IPv6 and
dual-stack support, causing the server to fail to start when deployed on
such network configurations:
```
WARN ThreadId(01) inbound: linkerd_app_core::serve: Server failed to accept connection error=No such file or directory (os error 2)
```
This change reintroduces detection of the current network family,
calling socket2's `original_dst()` or `original_dst_ipv6()` depending on
the case.
Tested fine in both IPv6 and dual-stack Kind clusters.
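The selection amounts to something like the following sketch (simplified,
assuming socket2's `all` feature on Linux; the proxy's actual accept path
carries more context):

```rust
use socket2::{SockAddr, SockRef};
use std::net::TcpStream;

/// Pick the SO_ORIGINAL_DST lookup that matches the connection's address
/// family, rather than unconditionally using the IPv4 variant.
#[cfg(target_os = "linux")]
fn orig_dst(tcp: &TcpStream) -> std::io::Result<SockAddr> {
    let sock = SockRef::from(tcp);
    if tcp.local_addr()?.is_ipv6() {
        sock.original_dst_ipv6()
    } else {
        sock.original_dst()
    }
}
```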
this golfs down the return expression in
`NameRef::try_from_ascii_str()`.
rather than binding our `s` to a temporary variable, in order to return
a `Self(s)` result, we can take the same result and use `Result::map` to
convert a `Result<&'a str, InvalidName>` to a
`Result<NameRef<'a>, InvalidName>`.
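in miniature, with stand-in types and a placeholder validation step, the
change is just:

```rust
struct InvalidName;
struct NameRef<'a>(&'a str);

/// stand-in for the actual validation logic.
fn validate(s: &str) -> Result<&str, InvalidName> {
    if s.is_ascii() {
        Ok(s)
    } else {
        Err(InvalidName)
    }
}

impl<'a> NameRef<'a> {
    fn try_from_ascii_str(s: &'a str) -> Result<Self, InvalidName> {
        // before: `let s = validate(s)?; Ok(Self(s))`
        // after: map the `Ok` value directly into the newtype.
        validate(s).map(Self)
    }
}

fn main() {
    assert!(NameRef::try_from_ascii_str("example.com").is_ok());
}
```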
Signed-off-by: katelyn martin <kate@buoyant.io>
* build(deps): bump the hickory group with 2 updates
Bumps the hickory group with 2 updates: [hickory-resolver](https://github.com/hickory-dns/hickory-dns) and [hickory-proto](https://github.com/hickory-dns/hickory-dns).
Updates `hickory-resolver` from 0.24.4 to 0.25.1
- [Release notes](https://github.com/hickory-dns/hickory-dns/releases)
- [Changelog](https://github.com/hickory-dns/hickory-dns/blob/main/OLD-CHANGELOG.md)
- [Commits](https://github.com/hickory-dns/hickory-dns/compare/v0.24.4...v0.25.1)
Updates `hickory-proto` from 0.24.4 to 0.25.1
- [Release notes](https://github.com/hickory-dns/hickory-dns/releases)
- [Changelog](https://github.com/hickory-dns/hickory-dns/blob/main/OLD-CHANGELOG.md)
- [Commits](https://github.com/hickory-dns/hickory-dns/compare/v0.24.4...v0.25.1)
---
updated-dependencies:
- dependency-name: hickory-resolver
dependency-type: direct:production
update-type: version-update:semver-minor
dependency-group: hickory
- dependency-name: hickory-proto
dependency-type: indirect
update-type: version-update:semver-minor
dependency-group: hickory
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore(dns): address breaking changes in `hickory-resolver`
see also #3782.
this commit addresses breaking changes in the v0.25.0 release of
`hickory-resolver`, used by our `linkerd-dns` crate to handle DNS
resolution.
see the release notes, here:
<https://github.com/hickory-dns/hickory-dns/releases/tag/v0.25.0>
> 0.25.0 represents a large release for the Hickory DNS project. Over 14
> months since 0.24.0, we've [..] addressed a number of findings from our
> first security audit.
changes that are relevant to us include:
> * Support for TLS using native-tls or OpenSSL has been removed. We now
> only provide first-party support for rustls (0.23, for DNS over TLS,
> HTTP/2, QUIC and HTTP/3). We support ring or aws-lc-rs for
> cryptographic operations both for DNSSEC and TLS. The
> dns-over-rustls, dns-over-native-tls, dns-over-openssl,
> dns-over-https-rustls, dns-over-https, dns-over-quic and dns-over-h3
> features have been removed in favor of a set of
> {tls,https,quic,h3}-{aws-lc-rs,ring} features across our library
> crates.
>
> * The synchronous API in the resolver and client crates, which
> previously provided a thin partial wrapper over the asynchronous
> API, has been removed. Downstream users will have to migrate to the
> asynchronous API.
>
> * Error types are now exposed directly in the crate roots.
this commit replaces references to
`hickory_resolver::error::ResolveError` with
`hickory_resolver::ResolveError`, now that the `error` submodule is
private. (hickory-dns/hickory-dns#2530)
this commit replaces references to
`hickory_resolver::TokioAsyncResolver` with its new name,
`hickory_resolver::TokioResolver`. (hickory-dns/hickory-dns#2521)
this commit inspects "no records found" errors according to the new api.
this particular change isn't explicitly documented, but occurred in
hickory-dns/hickory-dns#2094. see, in particular, the corresponding
changes in the upstream repo's own code. for
example: https://github.com/hickory-dns/hickory-dns/pull/2094/files#diff-330847b46040a30d449f85e8a804bea085f0974d3cba80d79d83acc56f33542dL176-R178
```diff
- match error.kind() {
- ResolveErrorKind::NoRecordsFound { query, soa, .. } => {
+ match error.proto().map(ProtoError::kind) {
+ Some(ProtoErrorKind::NoRecordsFound { query, soa, .. }) => {
```
there is a small pull request being proposed upstream to introduce a
`Builder::with_options()` method, which would make our construction of a
dns resolver marginally more idiomatic. this, however, is not a blocker
by any means.
X-Ref: hickory-dns/hickory-dns#2521
X-Ref: hickory-dns/hickory-dns#2830
X-Ref: hickory-dns/hickory-dns#2094
X-Ref: hickory-dns/hickory-dns#2877
Signed-off-by: katelyn martin <kate@buoyant.io>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: katelyn martin <kate@buoyant.io>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
this branch is motivated by [review feedback](https://github.com/linkerd/linkerd2-proxy/pull/3504#discussion_r1999706761) from #3504. see
linkerd/linkerd2#8733 for more information on upgrading `hyper`. there,
we asked:
> I wonder if we should be a little more defensive about cloning [`HttpConnect`]. What does cloning it mean? When handling a CONNECT request, we can't clone the request, really. (Technically, we can't clone the body, but practically, it means we can't clone the request). Can we easily track whether this was accidentally cloned (i.e. with a custom Clone impl or Arc or some such) and validate at runtime (i.e., in proxy::http::h1) that everything is copacetic?
`linkerd-http-upgrade` provides a `HttpConnect` type that is intended
for use as a response extension. this commit performs a refactor,
removing this type.
we use this extension in a single piece of tower middleware. typically,
these sorts of extensions are intended for e.g. passing state between
distinct layers of tower middleware, or otherwise facilitating
extensions to the HTTP family of protocols.
this extension is only constructed and subsequently referenced within a
single file, in the `linkerd_proxy_http::http::h1::Client`. we can
perform the same task by using the `is_http_connect` boolean we use to
conditionally insert this extension.
then, this branch removes a helper function for a computation whose
amortization is no longer as helpful. now that we are passing
`is_http_connect` down into this function, we are no longer inspecting
the response's extensions. because of that, the only work to do is to
check the status code, which is a very cheap comparison.
this also restates an `if version != HTTP_11 { .. }` conditional block as
a match statement. this is a code motion change; none of the inner blocks
are changed.
reviewers are encouraged to examine this branch commit-by-commit; because
of the sensitivity of this change, this refactor is performed in small,
methodical changes.
for posterity, i've run the linkerd/linkerd2 test suite against this branch, as of
57dd7f4a60.
---
* refactor(http/upgrade): remove `HttpConnect` extension
`linkerd-http-upgrade` provides a `HttpConnect` type that is intended
for use as a response extension. this commit performs a refactor,
removing this type.
we use this extension in a single piece of tower middleware. typically,
these sorts of extensions are intended for e.g. passing state between
distinct layers of tower middleware, or otherwise facilitating
extensions to the HTTP family of protocols.
this extension is only constructed and subsequently referenced within a
single file, in the `linkerd_proxy_http::http::h1::Client`. we can
perform the same task by using the `is_http_connect` boolean we use to
conditionally insert this extension.
Signed-off-by: katelyn martin <kate@buoyant.io>
* refactor(proxy/http): fold helper function
this removes a helper function for a computation whose amortization is
no longer as helpful.
now that we are passing `is_http_connect` down into this function, we
are no longer inspecting the response's extensions. because of that, the
only work to do is to check the status code, which is a very cheap
comparison.
Signed-off-by: katelyn martin <kate@buoyant.io>
* refactor(proxy/http): match on response status
this commit refactors a sequence of conditional blocks in a helper
function used to identify HTTP/1.1 upgrades.
this commit replaces this sequence of conditional blocks with a match
statement.
Signed-off-by: katelyn martin <kate@buoyant.io>
* nit(proxy/http): rename `res` to `rsp`
we follow a convention where we tend to name responses `rsp`, not `res`
or `resp`. this commit applies that convention to this helper function.
Signed-off-by: katelyn martin <kate@buoyant.io>
* nit(proxy/http): import `Version`
Signed-off-by: katelyn martin <kate@buoyant.io>
* refactor(proxy/http): match on http version
this restates an `if version != HTTP_11 { .. }` conditional block as a
match statement.
this is a code motion change; none of the inner blocks are changed.
Signed-off-by: katelyn martin <kate@buoyant.io>
* refactor(proxy/http): add comments on http/1.1
this commit adds a brief comment noting that upgrades are a concept
specific to http/1.1.
Signed-off-by: katelyn martin <kate@buoyant.io>
---------
Signed-off-by: katelyn martin <kate@buoyant.io>
Outbound hostname metrics were recently disabled. This conditionally re-enables them through a `LINKERD2_PROXY_OUTBOUND_METRICS_HOSTNAME_LABELS` env var, wired through the policy/routing config, with the option for individual policies and routes to set this separately from the global config.
Signed-off-by: Scott Fleener <scott@buoyant.io>
this commit adds a `[workspace.package]` table at the root of the cargo
workspace. constituent manifests are updated to use the workspace-level
metadata.
this is generally a superficial chore, but has a pleasant future upside:
when new rust editions are released (e.g. 2024), we will only need to
update the edition specified at the root of the workspace.
Signed-off-by: katelyn martin <kate@buoyant.io>
this commit performs a small refactor to one of the unit tests in
`linkerd-stack`'s load-shedding middleware.
this adds a span to the worker tasks spawned in this test, so that
tracing logs can be associated with particular oneshot services.
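the change amounts to wrapping each spawned task in a span, roughly like
this sketch (the helper and field names are assumptions, not the test's
exact code):

```rust
use tracing::Instrument;

async fn exercise_service() {
    // drive one oneshot service through the buffered, load-shedding stack ..
}

fn spawn_worker(name: &'static str) -> tokio::task::JoinHandle<()> {
    // every event emitted by the task now carries a `worker{id=..}` span, so
    // log lines can be tied back to a particular oneshot service.
    tokio::spawn(exercise_service().instrument(tracing::info_span!("worker", id = %name)))
}
```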
see #3744 for more information on upgrading our tower dependency. this
is cherry-picked from investigations on that branch into the breaking
changes to the `Buffer` middleware in tower 0.5.
after this change, logs now look like this:
```
; RUST_LOG="trace" cargo test -p linkerd-stack buffer_load_shed -- --nocapture
running 1 test
[ 0.002770s] TRACE worker{id=oneshot1}: tower::buffer::service: sending request to buffer worker
[ 0.002809s] TRACE worker{id=oneshot2}: tower::buffer::service: sending request to buffer worker
[ 0.002823s] TRACE worker{id=oneshot3}: tower::buffer::service: sending request to buffer worker
[ 0.002843s] DEBUG worker{id=oneshot4}: linkerd_stack::loadshed: Service has become unavailable
[ 0.002851s] DEBUG worker{id=oneshot4}: linkerd_stack::loadshed: Service shedding load
[ 0.002878s] TRACE tower::buffer::worker: worker polling for next message
[ 0.002885s] TRACE tower::buffer::worker: processing new request
[ 0.002892s] TRACE worker{id=oneshot1}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[ 0.002901s] DEBUG worker{id=oneshot1}: tower::buffer::worker: service.ready=true processing request
[ 0.002914s] TRACE worker{id=oneshot1}: tower::buffer::worker: returning response future
[ 0.002926s] TRACE tower::buffer::worker: worker polling for next message
[ 0.002931s] TRACE tower::buffer::worker: processing new request
[ 0.002935s] TRACE worker{id=oneshot2}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[ 0.002946s] TRACE worker{id=oneshot2}: tower::buffer::worker: service.ready=false delay
[ 0.002983s] TRACE worker{id=oneshot5}: tower::buffer::service: sending request to buffer worker
[ 0.003001s] DEBUG worker{id=oneshot6}: linkerd_stack::loadshed: Service has become unavailable
[ 0.003007s] DEBUG worker{id=oneshot6}: linkerd_stack::loadshed: Service shedding load
[ 0.003017s] DEBUG worker{id=oneshot7}: linkerd_stack::loadshed: Service has become unavailable
[ 0.003024s] DEBUG worker{id=oneshot7}: linkerd_stack::loadshed: Service shedding load
[ 0.003035s] TRACE tower::buffer::worker: worker polling for next message
[ 0.003041s] TRACE tower::buffer::worker: resuming buffered request
[ 0.003045s] TRACE worker{id=oneshot2}: tower::buffer::worker: resumed=true worker received request; waiting for service readiness
[ 0.003052s] DEBUG worker{id=oneshot2}: tower::buffer::worker: service.ready=true processing request
[ 0.003060s] TRACE worker{id=oneshot2}: tower::buffer::worker: returning response future
[ 0.003068s] TRACE tower::buffer::worker: worker polling for next message
[ 0.003073s] TRACE tower::buffer::worker: processing new request
[ 0.003077s] TRACE worker{id=oneshot3}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[ 0.003084s] DEBUG worker{id=oneshot3}: tower::buffer::worker: service.ready=true processing request
[ 0.003091s] TRACE worker{id=oneshot3}: tower::buffer::worker: returning response future
[ 0.003099s] TRACE tower::buffer::worker: worker polling for next message
[ 0.003103s] TRACE tower::buffer::worker: processing new request
[ 0.003107s] TRACE worker{id=oneshot5}: tower::buffer::worker: resumed=false worker received request; waiting for service readiness
[ 0.003114s] DEBUG worker{id=oneshot5}: tower::buffer::worker: service.ready=true processing request
[ 0.003121s] TRACE worker{id=oneshot5}: tower::buffer::worker: returning response future
[ 0.003129s] TRACE tower::buffer::worker: worker polling for next message
test loadshed::tests::buffer_load_shed ... ok
```
Signed-off-by: katelyn martin <kate@buoyant.io>
this commit replaces `humantime`, which is no longer maintained, with
`jiff`.
see this error when `main` today is built:
```
error[unmaintained]: humantime is unmaintained
┌─ /linkerd/linkerd2-proxy/Cargo.lock:78:1
│
78 │ humantime 2.1.0 registry+https://github.com/rust-lang/crates.io-index
│ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ unmaintained advisory detected
│
├ ID: RUSTSEC-2025-0014
├ Advisory: https://rustsec.org/advisories/RUSTSEC-2025-0014
├ Latest `humantime` crates.io release is four years old and GitHub repository has
not seen commits in four years. Question about maintenance status has not gotten
any reaction from maintainer: https://github.com/tailhook/humantime/issues/31
## Possible alternatives
* [jiff](https://crates.io/crates/jiff) provides same kind of functionality
├ Announcement: https://github.com/tailhook/humantime/issues/31
├ Solution: No safe upgrade is available!
├ humantime v2.1.0
└── linkerd-http-access-log v0.1.0
└── linkerd-app-inbound v0.1.0
├── linkerd-app v0.1.0
│ ├── linkerd-app-integration v0.1.0
│ └── linkerd2-proxy v0.1.0
├── linkerd-app-admin v0.1.0
│ ├── linkerd-app v0.1.0 (*)
│ └── (dev) linkerd-app-integration v0.1.0 (*)
└── linkerd-app-gateway v0.1.0
└── linkerd-app v0.1.0 (*)
advisories FAILED, bans ok, licenses ok, sources ok
```
see:
* https://github.com/rustsec/advisory-db/pull/2249
* https://github.com/tailhook/humantime/issues/31
Signed-off-by: katelyn martin <kate@buoyant.io>
kubert-prometheus-process is a new crate that includes all of Linkerd's system
metrics and more. This also helps avoid annoying build issues on
non-Linux systems.
this updates the prometheus client dependency.
additionally, this commit updates the `kubert-prometheus-tokio`
dependency, so that we agree on the client library in use.
Signed-off-by: katelyn martin <kate@buoyant.io>
When the proxy boots up, it needs to select a number of I/O worker threads to
allocate to the runtime. This change adds a new environment variable that allows
this value to scale based on the number of CPUs available on the host.
A CORES_MAX_RATIO value of 1.0 will allocate one worker thread per CPU core. A
lesser value will allocate fewer worker threads. Values are rounded to the
nearest whole number.
The CORES_MIN value sets a lower bound on the number of worker threads to use.
The CORES_MAX value sets an upper bound.
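A sketch of the resulting computation (names and exact rounding behavior
are assumptions based on the description above):

```rust
/// A sketch of combining CORES_MAX_RATIO with the CORES_MIN/CORES_MAX bounds.
fn worker_threads(cpus: usize, max_ratio: f64, min: usize, max: usize) -> usize {
    // One worker per core at a ratio of 1.0; round to the nearest whole number.
    let scaled = (cpus as f64 * max_ratio).round() as usize;
    // Then clamp the result to the configured lower and upper bounds.
    scaled.clamp(min, max)
}

fn main() {
    assert_eq!(worker_threads(16, 0.25, 1, 8), 4);
    assert_eq!(worker_threads(2, 0.25, 2, 8), 2); // CORES_MIN applies
    assert_eq!(worker_threads(64, 1.0, 1, 8), 8); // CORES_MAX applies
}
```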
The outbound proxy makes protocol decisions based on the discovery response,
keyed on a "parent" reference.
This change adds a `protocol::metrics` middleware that records connection counts
by parent reference.
Inbound proxies may receive meshed traffic directly on the proxy's inbound port
with a transport header, informing inbound routing behavior.
This change updates the inbound proxy to record metrics about the usage of
transport headers, including the total number of requests with a transport
header by session protocol and target port.
This change updates the DetectHttp middleware to record metrics about HTTP
protocol detection. Specifically, it records the counts of results and a very
coarse histogram of the time taken to detect the protocol.
The inbound, outbound, and admin (via inbound) stacks are updated to record
metrics against the main registry.
* refactor(http): consolidate HTTP protocol detection
Linkerd's HTTP protocol detection logic is spread across a few crates: the
linkerd-detect crate is generic over the actual protocol detection logic, and
the linkerd-proxy-http crate provides an implementation. There are no other
implemetations of the Detect interface. This leads to gnarly type signatures in
the form `Result<Option<http::Variant>, DetectTimeoutError>`: simultaneously
verbose and not particularly informative (what does the None case mean exactly).
This commit introduces a new crate, `linkerd-http-detect`, consolidating this
logic and removes the prior implementations. The admin, inbound, and outbound
stacks are updated to use these new types. This work is done in anticipation of
introducing metrics that report HTTP detection behavior.
There are no functional changes.
* feat(http/detect)!: error when the socket is closed
When a proxy does protocol detection, the initial read may indicate that the
connection was closed by the client with no data being written to the socket. In
such a case, the proxy continues to process the connection as if it may be
but we expect this to fail immediately. This can lead to unexpected proxy
behavior: for example, inbound proxies may report policy denials.
To address this, this change surfaces an error (as if the read call failed).
This could, theoretically, impact some bizarre clients that initiate half-open
connections. These corner cases can use explicit opaque policies to bypass
detection.
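In effect, the detection read now behaves like this sketch (simplified;
the real logic lives in the new `linkerd-http-detect` crate and also
handles timeouts and buffering):

```rust
use tokio::io::{AsyncRead, AsyncReadExt};

/// If the peer closes the connection before writing any bytes, surface an
/// error instead of continuing as though the connection could be proxied.
async fn read_initial<I>(io: &mut I, buf: &mut Vec<u8>) -> std::io::Result<()>
where
    I: AsyncRead + Unpin,
{
    if io.read_buf(buf).await? == 0 && buf.is_empty() {
        return Err(std::io::Error::new(
            std::io::ErrorKind::UnexpectedEof,
            "socket closed before protocol detection completed",
        ));
    }
    Ok(())
}
```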
We include a group/version/kind for inbound server resources, but we do not
indicate which specific port the server is applied to. This is important context
to understand the inbound proxy's behavior, especially when using the default
servers.
This change adds a `srv_port` label to inbound server metrics to definitively
and consistently indicate the server port used for inbound policy.
The RefusedNoTarget error type is a remnant of an older version of the direct
stack. This commit updates the error message to reflect the current state of the
code: we require ALPN-negotiated transport headers on all direct connections.
Linkerd's HTTP protocol detection logic is spread across a few crates: the
linkerd-detect crate is generic over the actual protocol detection logic, and
the linkerd-proxy-http crate provides an implementation. There are no other
implementations of the Detect interface. This leads to gnarly type signatures in
the form `Result<Option<http::Variant>, DetectTimeoutError>`: simultaneously
verbose and not particularly informative (what does the `None` case mean, exactly?).
This commit introduces a new crate, `linkerd-http-detect`, consolidating this
logic and removes the prior implementations. The admin, inbound, and outbound
stacks are updated to use these new types. This work is done in anticipation of
introducing metrics that report HTTP detection behavior.
There are no functional changes.
pr #3715 missed a small handful of cargo dependencies. this commit marks
these so that they also use the workspace-level tower version.
Signed-off-by: katelyn martin <kate@buoyant.io>
* chore(deps): `tower` is a workspace dependency
see https://github.com/linkerd/linkerd2/issues/8733 for more
information.
see https://github.com/linkerd/linkerd2-proxy/pull/3504 as well.
see #3456 (c740b6d8), #3466 (ca50d6bb), #3473 (b87455a9), and #3701
(cf4ef39) for some other previous pr's that moved dependencies to be
managed at the workspace level.
see also https://github.com/linkerd/drain-rs/pull/36 for another related
pull request that relates to our tower dependency.
Signed-off-by: katelyn martin <kate@buoyant.io>
* chore(deps): `tower-service` is a workspace dependency
Signed-off-by: katelyn martin <kate@buoyant.io>
* chore(deps): `tower-test` is a workspace dependency
Signed-off-by: katelyn martin <kate@buoyant.io>
---------
Signed-off-by: katelyn martin <kate@buoyant.io>
noticed while addressing `cargo-deny` errors in #3504. these crates
include a few unused dependencies, which we can remove. while we
are in the neighborhood, we make some subjective tweaks to tidy up
these imports.
---
* chore(opentelemetry): remove unused `http` dependency
Signed-off-by: katelyn martin <kate@buoyant.io>
* nit(opentelemetry): tidy imports
this groups imports at the crate level, and directly imports some
items from their respective crates rather than through an alias of
said crate. a `self` prefix is added to clarify imports from submodules
of this crate.
Signed-off-by: katelyn martin <kate@buoyant.io>
* chore(opentelemetry): remove unused `tokio-stream` dependency
Signed-off-by: katelyn martin <kate@buoyant.io>
* chore(opencensus): remove unused `http` dependency
Signed-off-by: katelyn martin <kate@buoyant.io>
* nit(opencensus): use self prefix in import
Signed-off-by: katelyn martin <kate@buoyant.io>
---------
Signed-off-by: katelyn martin <kate@buoyant.io>
Currently, TCP metrics are not logged for HTTP requests coming in through the tagged transport header stack.
This adds that instrumentation, like we do for the opaque and gateway stacks already present.
Signed-off-by: Scott Fleener <scott@buoyant.io>
see https://github.com/linkerd/linkerd2/issues/8733 for more
information.
this commit moves `prost-build` so that it is now managed as a workspace
dependency. while it is only used in tests, those tests can fail if it is
not versioned in lockstep with our other protobuf dependencies.
see #3456 (c740b6d8), #3466 (ca50d6bb), and especially #3473 (b87455a9)
for some other previous pr's that moved dependencies to be managed at
the workspace level.
Signed-off-by: katelyn martin <kate@buoyant.io>
see https://github.com/linkerd/linkerd2/issues/8733 for more
information.
we are in the process of upgrading to hyper 1.x.
in the process of doing so, we will wish to use our friendly `BoxBody`
type, which provides a convenient and reusable interface to abstract
over arbitrary `B`-typed request and response bodies.
unfortunately, by virtue of its definition, it is not a `Sync` type:
```rust
pub struct BoxBody {
inner: Pin<Box<dyn Body<Data = Data, Error = Error> + Send + 'static>>,
}
#[pin_project]
pub struct Data {
#[pin]
inner: Box<dyn bytes::Buf + Send + 'static>,
}
```
these are erased `Box<dyn ..>` objects that only ensure `Send`-ness.
rather than changing that, because that is the proper definition of the
type, we should update code in our test client and test server to stop
requiring arbitrary `Sync` bounds.
this commit removes `Sync` bounds from various places that in fact only
need be `Send + 'static`.
this will help facilitate making use of `BoxBody` in #3504.
Signed-off-by: katelyn martin <kate@buoyant.io>
this method is not used by any test code, nor any other internal code.
this commit removes
`linkerd_app_integration::tcp::TcpConn::target_addr()`.
Signed-off-by: katelyn martin <kate@buoyant.io>