Since these stack pieces will never error, we can mark their
`Error`s with a type that can never be created. When seeing `Error = ()`,
it can mean either that the error never happens, or that the detailed
error is dealt with elsewhere and only a unit is passed on. When seeing
`Error = Never`, it is clear that the error case never happens.
Besides helping humans, LLVM can also remove the error branches entirely.
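A minimal sketch of the idea (the proxy's actual definition may differ in
detail): an uninhabited enum has no values, so any code path that would
produce it is provably dead.

```rust
use std::{error::Error, fmt};

/// No value of `Never` can be constructed, so `Error = Never` documents
/// (and proves to the compiler) that the error path is unreachable.
#[derive(Debug)]
pub enum Never {}

impl fmt::Display for Never {
    fn fmt(&self, _: &mut fmt::Formatter<'_>) -> fmt::Result {
        // This match is exhaustive: there are no variants to handle.
        match *self {}
    }
}

impl Error for Never {}
```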
Signed-off-by: Sean McArthur <sean@buoyant.io>
The router's `Recognize` trait is now essentially a function.
This change provides an implementation of `Recognize` over a `Fn` so
that it's possible to implement routers without defining dedicated marker
types that implement `Recognize`.
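Roughly, the idea looks like this (the trait shape here is an assumption for
illustration, not the router's exact signature):

```rust
/// A simplified stand-in for the router's `Recognize` trait: map a request
/// to an optional routing target.
pub trait Recognize<Req> {
    type Target;
    fn recognize(&self, req: &Req) -> Option<Self::Target>;
}

/// Any closure with the right shape can act as a `Recognize`, so callers
/// don't need to define a dedicated marker type.
impl<Req, T, F> Recognize<Req> for F
where
    F: Fn(&Req) -> Option<T>,
{
    type Target = T;

    fn recognize(&self, req: &Req) -> Option<T> {
        (self)(req)
    }
}
```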
The `linkerd2_stack::Either` type is used to implement Layer, Stack, and
Service for alternate underlying implementations. However, the Service
implementation requires that both inner services emit the same type of
Error.
In order to allow the underlying types to emit different errors, this
change uses `Either` to wrap the underlying errors, and implements
`Error` for `Either`.
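The wrapping might look roughly like this (a sketch; the real
`linkerd2_stack::Either` also implements `Layer`, `Stack`, and `Service`):

```rust
use std::{error::Error, fmt};

/// An `Either` of two errors that is itself an error, so two inner
/// services with different error types can share one `Service` impl.
#[derive(Debug)]
pub enum Either<A, B> {
    A(A),
    B(B),
}

impl<A: fmt::Display, B: fmt::Display> fmt::Display for Either<A, B> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Either::A(a) => a.fmt(f),
            Either::B(b) => b.fmt(f),
        }
    }
}

impl<A: Error, B: Error> Error for Either<A, B> {}
```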
It was possible for a metrics scope to be deregistered for active
routes. This could cause metrics to disappear and never be recorded in
some situations.
This change ensures that metrics are only evicted for scopes that are not
active (i.e., not referenced by a router, load balancer, etc.).
With the introduction of profile-based classification, the proxy would
not perform normal gRPC classification in some cases when it could &
should.
This change simplifies our default classifier logic and falls back to
the default grpc-aware behavior whenever another classification cannot
be performed.
Furthermore, this change moves the `proxy::http::classify` module to
`proxy::http::metrics::classify`, as these modules should only be relied
on for metrics classification. Other modules (for instance, retries)
should provide their own abstractions.
Finally, this change fixes a test error-formatting issue.
Currently, the proxy uses a variety of types to represent the logical
destination of a request. Outbound destinations use a `NameAddr` type
which may be either a `DnsNameAndPort` or a `SocketAddr`. Other parts of
the code used a `HostAndPort` enum that always contained a port and also
contained a `Host` which could either be a `dns::Name` or an `IpAddr`.
Furthermore, we coerce these types into a `http::uri::Authority` in many
cases.
All of these types represent the same thing, and it's not clear when or why
it's appropriate to use a given variant.
In order to simplify the situation, a new `addr` module has been
introduced with `Addr` and `NameAddr` types. An `Addr` may
contain either a `NameAddr` or a `SocketAddr`.
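A sketch of the resulting shape (field details are illustrative; the proxy
uses a validated `dns::Name` rather than a raw `String`):

```rust
use std::net::SocketAddr;

/// A logical destination: either a named address or a raw socket address.
pub enum Addr {
    Name(NameAddr),
    Socket(SocketAddr),
}

/// A DNS name paired with a port.
pub struct NameAddr {
    name: String,
    port: u16,
}

impl Addr {
    /// Both variants always carry a port.
    pub fn port(&self) -> u16 {
        match self {
            Addr::Name(n) => n.port,
            Addr::Socket(a) => a.port(),
        }
    }
}
```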
The `Host` value has been removed from the `Settings::Http1` type,
replaced by a boolean, as it's redundant information stored elsewhere in
the route key.
There is one small change in behavior: The `authority` metrics label is
now omitted only for requests that include an `:authority` or `Host`
with a _name_ (i.e., not an IP address).
The Destination Profile API, provided by linkerd2-proxy-api v0.1.3,
allows the proxy to discover route information for an HTTP service. As
the proxy processes outbound requests, in addition to doing address
resolution through the Destination service, the proxy may also discover
profiles including route patterns and labels.
When the proxy has route information for a destination, it applies the
RequestMatch for each route to find the first-matching route. The
route's labels are used to expose `route_`-prefixed HTTP metrics (and
each label is prefixed with `rt_`).
Furthermore, if a route includes ResponseMatches, they are used to
perform classification (i.e. for the `response_total` and
`route_response_total` metrics).
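A sketch of the first-match selection described above, with stand-in types
for `RequestMatch` and `Route` (the real matchers are richer; this assumes
the `http` crate):

```rust
/// Stand-in for a profile route's labels.
pub struct Route {
    pub labels: Vec<(String, String)>,
}

/// Stand-in for a profile's request matcher.
pub enum RequestMatch {
    PathPrefix(String),
    Method(http::Method),
}

impl RequestMatch {
    fn is_match<B>(&self, req: &http::Request<B>) -> bool {
        match self {
            RequestMatch::PathPrefix(p) => req.uri().path().starts_with(p.as_str()),
            RequestMatch::Method(m) => req.method() == m,
        }
    }
}

/// The first route whose matcher applies to the request wins; if no route
/// matches, the caller falls back to a default route.
fn select_route<'r, B>(
    routes: &'r [(RequestMatch, Route)],
    req: &http::Request<B>,
) -> Option<&'r Route> {
    routes.iter().find(|(m, _)| m.is_match(req)).map(|(_, r)| r)
}
```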
A new `proxy::http::profiles` module implements a router that consumes
routes from an infinite stream of route lists.
The `app::profiles` module implements a client that continually retries
establishing a watch for the destination's routes (with some backoff).
Route discovery does not _block_ routing; that is, the first request to
a destination will likely be processed before the route information is
retrieved from the controller (i.e. on the default route). Route
configuration is applied in a best-effort fashion.
As described in https://github.com/linkerd/linkerd2/issues/1832, our eager
classification is too complicated.
This changes the `classification` label to only be used with the `response_total` label.
The following changes have been made:
1. `response_latency` metrics only include a `status_code`, not a classification.
2. `response_total` metrics include classification labels.
3. Transport metrics no longer expose a `classification` label (since it's misleading);
the `errno` label is now set to be empty when there is no error.
4. Only gRPC classification applies when the request's content type starts
with `application/grpc+`.
The `proxy::http::classify` APIs have been changed so that classifiers cannot
return a classification before the classifier is fully consumed.
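The shape of that constraint, roughly (illustrative names; the real traits
differ in detail): each transition consumes `self`, and only the
end-of-stream step can yield a class.

```rust
pub enum Class {
    Success,
    Failure,
}

/// Observes a response from its head to its end; no class can be produced
/// early, because only `ClassifyEos` yields a `Class`.
pub trait ClassifyResponse {
    type Eos: ClassifyEos;

    fn start<B>(self, rsp: &http::Response<B>) -> Self::Eos;
}

pub trait ClassifyEos {
    /// Called once the body ends, with gRPC trailers if present.
    fn eos(self, trailers: Option<&http::HeaderMap>) -> Class;

    /// A stream error also terminates classification.
    fn error(self, err: &dyn std::error::Error) -> Class;
}
```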
The controller's client is instantiated in the
`control::destination::background` module and is tightly coupled to its
use for address resolution.
In order to share this client across different modules---and to bring it
into line with the rest of the proxy's modular layout---the controller
client is now configured and instantiated in `app::main`. The
`app::control` module includes additional stack modules needed to
configure this client.
Our dependency on tower-buffer has been updated so that buffered
services may be cloned.
The `proxy::reconnect` module has been extended to support a
configurable fixed reconnect backoff; and this backoff delay has been
made configurable via the environment.
When a gRPC service fails a request eagerly, before it begins sending a
response, a `grpc-status` header is simply added to the initial response
headers (rather than to the trailers).
This change ensures that classification honors these status codes.
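A sketch of the resulting check (assuming the `http` crate; the proxy's
actual classifier carries more state):

```rust
/// Parse a `grpc-status` code from a header map, if present.
fn grpc_status(headers: &http::HeaderMap) -> Option<u32> {
    headers.get("grpc-status")?.to_str().ok()?.parse().ok()
}

/// Classify from the response head when possible: an eager failure carries
/// `grpc-status` in the initial headers; otherwise wait for the trailers.
fn classify_head<B>(rsp: &http::Response<B>) -> Option<bool> {
    grpc_status(rsp.headers()).map(|code| code == 0)
}
```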
Fixes linkerd/linkerd2#1819
Previously, stacks were built with `Layer::and_then`. This pattern
severely impacts compile-times as stack complexity grows.
In order to ameliorate this, `app::main` has been changed to build
stacks from the "bottom" (endpoint client) to the "top" (server-side
connection) by _push_-ing layers onto a concrete stack, rather than
composing layers for an abstract stack.
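A sketch of the push pattern under these assumptions (illustrative names,
not the stack crate's exact API):

```rust
/// A middleware that wraps an inner stack in a new service type.
pub trait Layer<S> {
    type Service;
    fn layer(&self, inner: S) -> Self::Service;
}

/// A concrete stack that grows by pushing layers onto it, bottom-to-top.
pub struct Stack<S>(S);

impl<S> Stack<S> {
    pub fn new(inner: S) -> Self {
        Stack(inner)
    }

    /// Every `push` produces a new, fully concrete `Stack` type, so the
    /// compiler never reasons about an abstract composition of layers.
    pub fn push<L: Layer<S>>(self, layer: L) -> Stack<L::Service> {
        Stack(layer.layer(self.0))
    }
}

// Hypothetical usage, from endpoint client up to the serverside connection:
// let stack = Stack::new(client)
//     .push(reconnect::layer())
//     .push(metrics::layer());
```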
While doing this, we take the opportunity to remove a ton of
now-unnecessary `PhantomData`. A new, dedicated `phantom_data` stack
module can be used to aid type inference as needed.
Other stack utilities like `map_target` and `map_err` have been
introduced to assist this transition.
Furthermore, all instances of `Layer::new` have been changed to a free
`fn layer` to improve readability.
This change sets up two upcoming changes: a stack-oriented `controller`
client and, subsequently, service-profile-based routing.
* Prepare HTTP metrics for per-route classification
In order to support Service Profiles, the proxy will add a new scope of
HTTP metrics prefixed with `route_`, so that the proxy exposes
`request_total` and `route_request_total` independently.
Furthermore, the proxy must be able to use different
response-classification logic for each route, and this classification
logic should apply to both metrics scopes.
This alters the `proxy::http::metrics` module so that:
1. HTTP metrics may be scoped with a prefix (as the stack is described).
2. The HTTP metrics layer now discovers the classifier by trying to
extract it from each request's extensions, falling back to a `Default`
implementation. Only the default implementation is used presently.
3. It was too easy to use the `Classify` trait API incorrectly.
Non-default classify implementations could cause a runtime panic!
The API has been changed so that the type system ensures correct
usage.
4. The HTTP classifier must be configurable per-request. In order to do
this, we expect a higher stack layer will add response classifiers to
request extensions when appropriate (i.e., in a follow-up).
Finally, the `telemetry::Report` type requires updating every time a new
set of metrics is added. We don't need a struct to represent this.
`FmtMetrics::and_then` has been added as a combinator so that a fixed
type is not necessary.
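A sketch of the combinator (the `FmtMetrics` trait is simplified here to its
formatting method):

```rust
use std::fmt;

/// Stand-in for the `linkerd2_metrics` reporting trait.
pub trait FmtMetrics {
    fn fmt_metrics(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result;

    /// Chain another reporter after this one, avoiding a dedicated struct
    /// that must be edited whenever a new set of metrics is added.
    fn and_then<B: FmtMetrics>(self, next: B) -> AndThen<Self, B>
    where
        Self: Sized,
    {
        AndThen(self, next)
    }
}

pub struct AndThen<A, B>(A, B);

impl<A: FmtMetrics, B: FmtMetrics> FmtMetrics for AndThen<A, B> {
    fn fmt_metrics(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.fmt_metrics(f)?;
        self.1.fmt_metrics(f)
    }
}
```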
The `proxy::http::balance` module uses the `proxy::resolve::Resolve`
trait to implement a `Discover`.
This coupling between the balance and resolve modules prevents
integrating the destination profile API such that there is a per-route,
per-endpoint stack.
This change makes the `balance` stack generic over a stack that produces
a `Discover`. The `resolve` module now implements a stack that produces
a `Discover` and is generic over a per-endpoint stack.
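Roughly, the new split looks like this (illustrative types; the real code
uses `tower`-style discovery and an asynchronous resolution stream):

```rust
use std::net::SocketAddr;

/// Stand-in for a discovery update consumed by the balancer.
pub enum Change<S> {
    Insert(SocketAddr, S),
    Remove(SocketAddr),
}

/// Stand-in for an update produced by a `Resolution`.
pub enum Update {
    Add(SocketAddr),
    Remove(SocketAddr),
}

/// Map a resolution update into a discovery change, building a service for
/// newly added endpoints with the per-endpoint stack (a closure here).
fn to_change<S>(update: Update, make_endpoint: impl Fn(SocketAddr) -> S) -> Change<S> {
    match update {
        Update::Add(addr) => Change::Insert(addr, make_endpoint(addr)),
        Update::Remove(addr) => Change::Remove(addr),
    }
}
```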
The control client implements a backoff service that dampens reconnect
attempts to the control plane by waiting a fixed period of time after a
failure.
Furthermore, the control client logs errors each time a reconnect
attempt fails.
This change moves backoff logic from
control::destination::background::client to proxy::reconnect.
Because the reconnect module handles connection errors uniformly, muting
repeated errors, it also has enough context to know when a backoff
should be applied -- when the underlying NewService cannot produce a
Service.
If polling the inner service fails once the Service has been
established, we do not want to apply a backoff, since this may
just be the result of a connection being terminated, a process being
restarted, etc.
The TLS-configuration-watching logic in `app::outbound::tls_config` need
not be specific to the outbound types, or even TLS configuration.
Instead, this change extends the `watch` stack module with a Stack type
that can satisfy the TLS use case independently of the concrete types at
play.
The listener *already* got the remote address when the connection was accepted, but we dropped that value by using `TcpListener::incoming`. By the time we called `socket.peer_addr()`, the connection may have been closed, and thus we were panicking.
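For illustration, with std's blocking API (the proxy uses tokio, but the
shape is the same): `accept` hands back the peer address at accept time, so
no later `peer_addr` call is needed.

```rust
use std::net::TcpListener;

fn accept_loop(listener: TcpListener) -> std::io::Result<()> {
    loop {
        // The remote address is captured here, at accept time; there is
        // nothing left to panic on later.
        let (socket, remote_addr) = listener.accept()?;
        // If the peer has already closed, the read path will see EOF and
        // the connection is dropped gracefully.
        let _ = (socket, remote_addr);
    }
}
```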
By removing the panic here, later code should notice that the connection is closed (when a `read` finds EOF), and it should be dropped gracefully.
For the same reasons (that the connection might already be closed), this reduces the `error!` from `get_original_dst` to a `warn!`, matching how a `set_nodelay` failure is logged. No need to yell in that case.
Closes https://github.com/linkerd/linkerd2/issues/1787
Signed-off-by: Sean McArthur <sean@buoyant.io>
Previously, the `client` module was responsible for instrumenting
reconnects. Now, the reconnect module becomes its own stack layer that
composes over NewService stacks.
Additionally, the `proxy::http::client` module can now layer over an
underlying Connect stack.
As the proxy's functionality has grown, the HTTP routing functionality
has become complex. Module boundaries have become ill-defined, which
leads to tight coupling--especially around the `ctx` metadata types and
`Service` type signatures.
This change introduces a `Stack` type (and subcrate) that is used as the
base building block for proxy functionality. The `proxy` module now
exposes generic components--stack layers--that are configured and
instantiated in the `app::main` module.
This change reorganizes the repo as follows:
- Several auxiliary crates have been split out from the `src/` directory
into `lib/fs-watch`, `lib/stack` and `lib/task`.
- All logic specific to configuring and running the linkerd2 sidecar
proxy has been moved into `src/app`. The `Main` type has been moved
from `src/lib.rs` to `src/app/main.rs`.
- The `src/proxy` directory has reusable, generic components useful for building
proxies in terms of `Stack`s.
The logic contained in `lib/bind.rs`, pertaining to per-endpoint service
behavior, has almost entirely been moved into `app::main`.
`control::destination` has changed so that it is not responsible for
building services. (It used to take a clone of `Bind` and use it to
create per-endpoint services). Instead, the destination service
implements the new `proxy::Resolve` trait, which produces an infinite
`Resolution` stream for each lookup. This allows the `proxy::balance`
module to be generic over the service discovery source.
Furthermore, the `router::Recognize` API has changed to only expose a
`recognize()` method and not a `bind_service()` method. The
`bind_service` logic is now modeled as a `Stack`.
The `telemetry::http` module has been replaced by a
`proxy::http::metrics` module that is generic over its metadata types
and does not rely on the old telemetry event system. These events are
now a local implementation detail of the `tap` module.
There are no user-facing changes in the proxy's behavior.
This branch changes the proxy's `trust-dns-resolver` dependency to a
version dependency rather than a Git dependency, since the
`0.10.0-alpha.3` version has the features that we previously required
the git dependency for.
The only changes to the proxy codebase itself were fixes for deprecation
warnings introduced by the dependency upgrade, since it was necessary to
update the minimum `tokio_timer` version as `trust-dns-proto` uses APIs
added in `tokio-timer` v0.2.6. In particular, `tokio_timer::Deadline`
was deprecated and replaced by `Timeout`.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
There are a few crufty remaining references to the `private` and
`public` proxy interfaces. `private` referred to the pod-local side of
the proxy, and `public` to the external side.
These terms have been replaced by `inbound` and `outbound` proxies.
The "private forward address" is now the "inbound forward address".
The "private listener" is now the "outbound listener".
The "public listener" is now the "inbound listener".
This change adds alternate environment variables to support configuration:
- `LINKERD2_PROXY_INBOUND_CONNECT_TIMEOUT`
- `LINKERD2_PROXY_INBOUND_FORWARD`
- `LINKERD2_PROXY_INBOUND_LISTENER`
- `LINKERD2_PROXY_OUTBOUND_CONNECT_TIMEOUT`
- `LINKERD2_PROXY_OUTBOUND_LISTENER`
The old configuration variables have not been removed. Instead, a
warning is logged when such a variable is used to configure the proxy.
* Fix linkerd2-metrics test compilation
The `quickcheck` dependency was lost when the subcrate was split out.
This change restores the dependency.
Fixes linkerd/linkerd2#1685
* Run tests for all packages in `make test`
As we extract subcrates from the `src/` directory, the repository root
becomes a bit cluttered. This change moves these subcrates into a `lib`
directory.
The `timeout` module has very little to do with the proxy, specifically.
This change moves it into a dedicated subcrate. This helps to clarify
dependencies and to minimize generic logic in the main proxy crate.
In doing this, an unused implementation of AsyncRead/AsyncWrite for
Timeout was deleted.
Furthermore, the `HumanDuration` type has been copied into
tests/support/mod.rs so that this type need not be part of the timeout
module's public API.
Split proxy infrastructure from `bind`
In order to simplify `bind.rs`, this change moves `NormalizeUri` into
its own module, `proxy::http::normalize_uri`.
This will later become part of the "client stack" built from
`proxy::http`.
Previously, `proxy::Server` was generic over a `NewService` (constructed
in `lib.rs`) that instruments error handling around the router and metrics. In
preparation for adding a metrics module into the server stack (that is configured
by the source connection), `Server` should be changed to instantaneously build
clients with a `MakeClient<Arc<ctx::transport::Server>>`.
In order to do this, the following changes were made:
1. The `svc::NewClient` type was changed to `svc::MakeClient<Target>`. The
naming change ("New" to "Make") is intended to differentiate the type from
`NewService`, which is asynchronous and does not accept a `Target` argument
(see the sketch after this list).
2. The `proxy::h2_router` module was split from `lib.rs` and `map_err.rs`. `MapErr`
tried to be generic, though we only used it in one place. Now, the `h2_router::Make`
type supports cloning routers and handling their errors.
3. The `TimestampRequestOpen` middleware was split into its own file and given a
`MakeClient` implementation.
4. The `svc::Layer` trait has been introduced to support layering middlewares like
`TimestampRequestOpen`. This is analogous to Finagle's `Stack.Module` type.
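A sketch of the distinction drawn in item 1 (signatures are illustrative):
`MakeClient` builds a client synchronously from a target value, whereas a
`NewService`-style constructor is asynchronous and takes no target.

```rust
/// Builds a client for a given target, immediately and by reference.
pub trait MakeClient<Target> {
    type Client;

    fn make_client(&self, target: &Target) -> Self::Client;
}
```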
There are no functional changes.
This change clarifies the naming and role of the `proxy` (née transparency)
module. There are no functional changes.
`proxy::tcp::Proxy` has been renamed to `proxy::tcp::Forward` to help
disambiguate terminology: TCP connections may be _forwarded_
by the proxy server.
Currently, the layered service implementations that comprise the HTTP
stack are a mix of `Service` and `NewService` types. In the
endpoint-specific stack, `transparency::Client` is the only layer that
actually needs to be a `NewService`, since it is immediately wrapped with a
`Reconnect`.
This allows us to remove several `NewService` implementations.
This extracts a new `svc::Reconnect` middleware from `bind`, handling
connection error logging and hiding `tower_reconnect::Error` from outer
layers.
Furthermore, two HTTP/1-specific middlewares have been moved outside of
the TLS rebinding layer, since they are not dependent on TLS
configuration.
Finally, `bind`'s type aliases have been simplified, removing the
`HttpRequest` and `HttpResponse` aliases. By removing these, and
removing `transparency::Client`'s dependency on the telemetry body
types, it should be easier to change type signatures going forward.
`bind::BoundService` wraps a `Reconnect` service and handles its Connect
errors. However, `BoundService` exposes `Reconnect`'s Error type to
callers even though these errors can never be returned.
Furthermore, `Reconnect` is allowed to be polled after returning an error,
triggering the inner service to be rebuilt. We needlessly duplicate this
logic in `BoundService`.
Before splitting this file up into smaller chunks, let's update
`BoundService` to more narrowly adhere to `Reconnect`'s API:
- Only the inner error type is returned. `unreachable!` assertions
have been made where error variants cannot be returned.
- Do not "rebind" the stack explicitly. Instead, let `Reconnect` do
this.
- Now `BoundService::call` may panic if invoked before `poll_ready`. It's a
programming error, since `Reconnect` requires that `poll_ready` be
called first.
The `control::destination` exposes an important trait, `Bind`, that
abstracts the logic of instantiating a new service for an individual
endpoint (i.e., in a load balancer).
This interface is not specific to our service discovery implementation,
and can easily be used to model other types of client factory.
In the spirit of consolidating our HTTP-specific logic, and making the
core APIs of the proxy more visible, this change renames the `Bind`
trait to `NewClient`, simplifies the trait to have fewer type
parameters, and documents this new generalized API.
Following #84, the `telemetry::transport` module can be moved into the
`transport` module.
This should allow us to simplify type signatures by combining redundant
types. It's also hoped that we can reduce the API boilerplate around
metrics so it's much easier to instrument and track new metrics in
transport code.
The `metrics!` macro is currently local to the telemetry module.
Furthermore, the `telemetry::metrics` module no longer has
proxy-specific logic.
This change moves the `telemetry::metrics` module into a new crate,
`linkerd2_metrics`.
This will enable unifying `telemetry::http` and `telemetry::transport`
into `http` and `transport`, respectively.
The metrics Service implementation that renders prometheus metrics can
be used independently of any specific listener.
This change moves the binding and listening details into the `control`
module, as this seems like the best umbrella for the specifics of
serving things to the control plane.
Now that transport details have been separated into modules, the
`metrics::Root` type makes more sense as a `telemetry::Report` type.
With this change, the `telemetry::metrics` module holds only the
abstract structural details of metrics reporting.
To this end:
- `metrics::Root` is now `telemetry::Report`
- `metrics::Serve` is now generic over `FmtMetrics`. It's only an
implementation detail that the `telemetry::Report` type is used.
- all _Report_ types now implement `Clone` so that the main report
can be cloned for each connection (i.e. from prometheus).
The metrics server is responsible for evicting unused metrics, which
seems like an unnecessary coupling with the storage implementation.
This change moves this logic into `telemetry::http` so that the
eviction strategy is specific to the implementation.
Now that the metrics structure is shared internally to `http`,
`Report`'s implementation of `FmtMetrics` can evict expired metrics.
There are no functional changes.
Previously, many of `telemetry::http`'s types and internal
implementation details were exposed to the rest of the telemetry system.
In preparation for further changes to support more granular locking,
this change makes metric storage and recording internal implementation details.
Following this, `telemetry::http` exposes a `Report` type for printing
metrics to the server and a `Sensors` type used to instrument stacks
with HTTP telemetry. These types share an internally-mutable metrics
registry that is private to the http module.
The `event` types continue to be exposed to support Tap, but the
convenience exports have been removed.
The `metrics::Root` type no longer needs to be shareable. This type will
be replaced in a followup change.
In preparation for further simplifications to HTTP telemetry, this
change consolidates all HTTP-specific logic under the `telemetry::http`
module.
Specifically, the following modules have been moved:
- `telemetry::event`;
- `telemetry::metrics::labels`;
- `telemetry::metrics::record`;
- `telemetry::sensors`; and
- `telemetry::sensors::http`.
This change takes pains to avoid changing any implementation details, so
some types and methods have been made public temporarily while the
interface boundaries are not well defined. This will be fixed in a
subsequent change.
`Bind` was initially written so that a `Sensors` implementation is
optional. Over time, this hasn't proven to be very useful.
In preparation for further changes to HTTP telemetry, this change
simplifies the Bind and Sensors types so that an HTTP sensor is always
required to construct `Bind`.
Test-only constructors have been added to satisfy the case where Sensors
were omitted.
Previously, transport telemetry was recorded by emitting Events from an
IO instance to an aggregator. This requires that each update take a
global telemetry lock, and is an impediment to richer telemetry.
This change removes the transport event types so that the Event and
Record types are left only to represent HTTP telemetry. Now, the
transport's IO type holds a reference to a shared `Metrics` structure.
As the transport is used, metric values are updated immediately.
A lock on the transport _registry_ is taken whenever a new transport is
opened/accepted and when metrics are reported. Each transport class's
metrics are now shared & locked independently, so it's possible for a
transport to update its metrics while the registry is being manipulated.
This has one functional change: the `tcp_read_bytes_total` and
`tcp_write_bytes_total` counters are now updated instantaneously.
Previously these values were only incremented on transport close, which
is misleading, especially for long-lived connections.
With this change, all transport-related telemetry logic lives in
`telemetry::transport`.
In anticipation of removing Transport-related Event types, we want to
separate the concerns of recording transport metrics updates from
reporting them to the metrics endpoint.
The transport module has been split into `Registry` and `Report` types:
`Registry` is responsible for recording updates, and `Report` is
responsible for rendering metrics.
Following #67 and #68, the `labels::TlsStatus` type can be removed in
favor of extending the underlying `ctx::transport::TlsStatus` type to
implement `FmtLabels`.