When the inbound proxy receives requests, these requests may have
relative `:authority` values like _web:8080_. Because these requests can
come from hosts with a variety of DNS configurations, the inbound proxy
can't reliably infer the fully qualified name (e.g.
_web.ns.svc.cluster.local._).
In order for the inbound proxy to discover inbound service profiles, we
need to establish some means for the inbound proxy to determine the
"canonical" name of the service for each request.
This change introduces a new `l5d-dst-canonical` header that is set by
the outbound proxy and used by the remote inbound proxy to determine
which profile should be used.
The outbound proxy determines the canonical destination by performing
DNS resolution as requests are routed and uses this name for profile and
address discovery. This change removes the proxy's hardcoded Kubernetes
dependency.
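For illustration, a minimal sketch of how such a header could be set by the
outbound proxy and read by the inbound proxy. This is not the proxy's actual
code; the helper names are hypothetical and only the `http` crate types are
assumed:

```rust
use http::header::{HeaderName, HeaderValue};
use http::Request;

const L5D_DST_CANONICAL: &str = "l5d-dst-canonical";

// Outbound side: record the canonicalized authority on the forwarded request.
fn set_canonical_dst<B>(req: &mut Request<B>, canonical: &str) {
    if let Ok(value) = HeaderValue::from_str(canonical) {
        req.headers_mut()
            .insert(HeaderName::from_static(L5D_DST_CANONICAL), value);
    }
}

// Inbound side: use the header, when present, to select a service profile.
fn canonical_dst<B>(req: &Request<B>) -> Option<&str> {
    req.headers()
        .get(L5D_DST_CANONICAL)
        .and_then(|v| v.to_str().ok())
}
```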
The `LINKERD2_PROXY_DESTINATION_GET_SUFFIXES` and
`LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES` environment variables
control which domains may be discovered via the destination service.
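A rough sketch of how a suffix list can gate discovery. This is illustrative
only; the helper name and the exact matching rules (including treating a bare
`.` as matching every name) are assumptions, not the proxy's implementation:

```rust
// Returns true if `name` falls under one of the configured suffixes.
fn in_suffix_list(name: &str, suffixes: &[String]) -> bool {
    let name = name.trim_end_matches('.');
    suffixes.iter().any(|sfx| {
        let sfx = sfx.trim_end_matches('.');
        // An empty (".") suffix matches everything.
        sfx.is_empty() || name == sfx || name.ends_with(&format!(".{}", sfx))
    })
}

fn main() {
    // E.g. LINKERD2_PROXY_DESTINATION_GET_SUFFIXES="svc.cluster.local."
    let suffixes = vec!["svc.cluster.local.".to_string()];
    assert!(in_suffix_list("web.ns.svc.cluster.local.", &suffixes));
    assert!(!in_suffix_list("example.com.", &suffixes));
}
```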
Finally, HTTP settings detection has been moved into a dedicated routing
layer at the "bottom" of the stack, so that canonicalization and
discovery need not be performed redundantly for each set of HTTP
settings. Now, HTTP settings only configure the HTTP client stack within
an endpoint.
Fixes linkerd/linkerd2#1798
As the proxy's functionality has grown, the HTTP routing functionality
has become complex. Module boundaries have become ill-defined, which
leads to tight coupling--especially around the `ctx` metadata types and
`Service` type signatures.
This change introduces a `Stack` type (and subcrate) that is used as the
base building block for proxy functionality. The `proxy` module now
exposes generic components--stack layers--that are configured and
instantiated in the `app::main` module.
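To give a feel for the shape of this abstraction, here is a minimal sketch.
The trait and method names are simplified and should not be read as the
actual `Stack`/`Layer` API:

```rust
// A stack builds a value (typically a service) for a target.
trait Stack<Target> {
    type Value;
    fn make(&self, target: &Target) -> Self::Value;
}

// A layer wraps an inner stack to produce a new stack; `app::main` composes
// layers into the full proxy, while `proxy` only provides the generic pieces.
trait Layer<Target, Inner: Stack<Target>> {
    type Stack: Stack<Target>;
    fn bind(&self, inner: Inner) -> Self::Stack;
}

/// An example layer that leaves the inner stack unchanged.
struct Identity;

impl<T, Inner: Stack<T>> Layer<T, Inner> for Identity {
    type Stack = Inner;
    fn bind(&self, inner: Inner) -> Inner {
        inner
    }
}
```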
This change reorganizes the repo as follows:
- Several auxiliary crates have been split out from the `src/` directory
into `lib/fs-watch`, `lib/stack` and `lib/task`.
- All logic specific to configuring and running the linkerd2 sidecar
proxy has been moved into `src/app`. The `Main` type has been moved
from `src/lib.rs` to `src/app/main.rs`.
- The `src/proxy` module contains reusable, generic components useful for
  building proxies in terms of `Stack`s.
The logic contained in `lib/bind.rs`, pertaining to per-endpoint service
behavior, has almost entirely been moved into `app::main`.
`control::destination` has changed so that it is not responsible for
building services. (It used to take a clone of `Bind` and use it to
create per-endpoint services). Instead, the destination service
implements the new `proxy::Resolve` trait, which produces an infinite
`Resolution` stream for each lookup. This allows the `proxy::balance`
module to be generic over the service discovery source.
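A rough sketch of the shape of this interface, assuming futures 0.1 types.
The trait definitions here are simplified and not the exact ones in
`proxy::Resolve`:

```rust
use futures::Poll;
use std::net::SocketAddr;

// A change to the set of endpoints for a destination.
enum Update<Endpoint> {
    Add(SocketAddr, Endpoint),
    Remove(SocketAddr),
}

trait Resolution {
    type Endpoint;
    type Error;
    /// Polls for the next endpoint update; the stream never terminates.
    fn poll(&mut self) -> Poll<Update<Self::Endpoint>, Self::Error>;
}

trait Resolve<Target> {
    type Endpoint;
    type Resolution: Resolution<Endpoint = Self::Endpoint>;
    /// Starts a lookup for `target`, yielding a stream of updates that the
    /// balancer consumes regardless of the discovery source.
    fn resolve(&self, target: &Target) -> Self::Resolution;
}
```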
Furthermore, the `router::Recognize` API has changed to only expose a
`recognize()` method and not a `bind_service()` method. The
`bind_service` logic is now modeled as a `Stack`.
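An assumed shape, for illustration only (not the exact signature):

```rust
// `Recognize` now only maps a request to a routing key; building the
// service for that key is handled by a `Stack` elsewhere.
trait Recognize<Request> {
    type Target;
    fn recognize(&self, req: &Request) -> Option<Self::Target>;
}
```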
The `telemetry::http` module has been replaced by a
`proxy::http::metrics` module that is generic over its metadata types
and does not rely on the old telemetry event system. These events are
now a local implementation detail of the `tap` module.
There are no user-facing changes in the proxy's behavior.
When the destination service returns a hint that an endpoint is another
proxy, eligible HTTP1 requests are translated into HTTP2 and sent over
an HTTP2 connection. The original protocol details are encoded in a
header, `l5d-orig-proto`. When a proxy receives an inbound HTTP2
request with this header, the request is translated back into its HTTP/1
representation before being passed to the internal service.
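A simplified sketch of the translation, using `http` crate types. The helper
names are hypothetical and this is not the proxy's implementation:

```rust
use http::header::{HeaderName, HeaderValue};
use http::{Request, Version};

const L5D_ORIG_PROTO: &str = "l5d-orig-proto";

// Outbound: record the original protocol and send the request over HTTP/2.
fn upgrade<B>(req: &mut Request<B>) {
    let orig = match req.version() {
        Version::HTTP_10 => "HTTP/1.0",
        _ => "HTTP/1.1",
    };
    req.headers_mut().insert(
        HeaderName::from_static(L5D_ORIG_PROTO),
        HeaderValue::from_static(orig),
    );
    *req.version_mut() = Version::HTTP_2;
}

// Inbound: restore the original HTTP/1 representation before passing the
// request to the internal service.
fn downgrade<B>(req: &mut Request<B>) {
    if let Some(orig) = req.headers_mut().remove(L5D_ORIG_PROTO) {
        *req.version_mut() = if orig == "HTTP/1.0" {
            Version::HTTP_10
        } else {
            Version::HTTP_11
        };
    }
}
```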
Signed-off-by: Sean McArthur <sean@buoyant.io>
Required for linkerd/linkerd2#1322.
Currently, the proxy places a limit on the number of active routes
in the route cache. This limit defaults to 100 routes, and is intended
to prevent the proxy from requesting more than 100 lookups from the
Destination service.
However, in some cases, such as Prometheus scraping a large number of
pods, the proxy hits this limit even though none of those requests
actually result in requests to service discovery (since Prometheus
scrapes pods by their IP addresses).
This branch implements @briansmith's suggestion in
https://github.com/linkerd/linkerd2/issues/1322#issuecomment-407161829.
It splits the router capacity limit into two separate, configurable
limits: one that sets an upper bound on the number of concurrently
active destination lookups, and one that limits the capacity of the
router cache.
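A rough sketch of how the two limits relate. The types and field names are
illustrative only, not the proxy's actual code:

```rust
use std::collections::HashMap;
use std::hash::Hash;

struct Router<K, Route> {
    routes: HashMap<K, Entry<Route>>,
    cache_capacity: usize,     // bound on cached routes
    active_lookups: usize,     // current Destination service queries
    max_active_lookups: usize, // bound on concurrent queries
}

struct Entry<Route> {
    route: Route,
    // True only for routes backed by a Destination service query; routes
    // resolved by IP or DNS occupy cache capacity but no lookup slot.
    holds_lookup: bool,
}

impl<K: Hash + Eq, Route> Router<K, Route> {
    fn evict(&mut self, key: &K) {
        if let Some(entry) = self.routes.remove(key) {
            if entry.holds_lookup {
                // Evicting an inactive route frees its query slot.
                self.active_lookups -= 1;
            }
        }
    }

    fn can_start_lookup(&self) -> bool {
        self.active_lookups < self.max_active_lookups
    }
}
```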
I've done some preliminary testing using the `lifecycle` tests, where a
single Prometheus instance is configured to scrape a very large number
of proxies. In these tests, neither limit is reached. Furthermore, I've added
integration tests in `tests/discovery` to exercise the destination service
query limit. These tests ensure that query capacity is released when inactive
routes that created queries are evicted from the router cache, and that the
limit does _not_ affect DNS queries.
This branch obsoletes and closes #27, which contained an earlier version of
these changes.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
The `Backoff` service wrapper is used for the controller client service
so that, if the proxy can't find the controller (i.e. there is a
connection error), it doesn't keep retrying in a tight loop but instead
waits a couple of seconds before trying again, on the assumption that
the control plane is rebooting.
When "backing off", a timer would be set, but it wasn't polled, so the
task was never registered to wake up after the delay. This turns out to
not have been a problem in practice, since the background destination
task was joined with other tasks that were constantly waking up,
allowing it to try again anyways.
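A simplified sketch of the issue and the fix, in futures 0.1 style. The type
and method names are illustrative, not the proxy's actual `Backoff`:

```rust
use futures::{Async, Future, Poll};
use std::time::{Duration, Instant};
use tokio::timer::Delay;

struct Backoff {
    delay: Duration,
    waiting: Option<Delay>,
}

impl Backoff {
    /// Begins a backoff after a connection error. This is called from within
    /// the task's own `poll`, so polling the delay here schedules a wakeup
    /// before the task parks.
    fn start(&mut self) {
        let mut delay = Delay::new(Instant::now() + self.delay);
        let _ = delay.poll();
        self.waiting = Some(delay);
    }

    /// Returns `NotReady` until the backoff period has elapsed.
    fn poll_backoff(&mut self) -> Poll<(), ()> {
        if let Some(ref mut delay) = self.waiting {
            // This poll was the missing piece: without it the task was never
            // registered with the timer and only woke up because unrelated
            // sibling tasks happened to wake the executor.
            if let Ok(Async::NotReady) = delay.poll() {
                return Ok(Async::NotReady);
            }
        }
        self.waiting = None;
        Ok(Async::Ready(()))
    }
}
```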
To add tests for this, a new `ENV_CONTROL_BACKOFF_DELAY` config value
has been added, so that the tests don't have to wait the default 5
seconds.
Signed-off-by: Sean McArthur <sean@buoyant.io>