carllerche/h2#338 fixes a deadlock in stream reference counts that could
potentially impact the proxy. This branch updates our `h2` dependency to a
version which includes this change.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
An upstream bug in the `trust-dns-proto` library can cause
`trust-dns-resolver` to leak UDP sockets when DNS queries time out. This
issue appears to be the cause of the memory leak described in
linkerd/linkerd2#2012.
This branch updates the `trust-dns` dependency to pick up the change in
bluejekyll/trust-dns#635, which fixes the UDP socket leak.
I confirmed that the socket leak was fixed by modifying the proxy to
hard-code a 0-second DNS timeout, sending requests to the proxy's
outbound listener, and using
`lsof -p $(pgrep linkerd2-proxy)`
to count the number of open UDP sockets. On master, every request to a
different DNS name that times out leaves behind an additional open UDP
socket, which shows up in `lsof`, while on this branch, only TCP sockets
remain open after the request ends.
In addition, I'm running a test in GCP to watch the memory and file
descriptor use of the proxy over a long period of time. This is still in
progress, but given the above, I strongly believe this branch fixes the
leak.
Fixes linkerd/linkerd2#2012.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Adopts changes from https://github.com/tower-rs/tower/pull/134
> balance: Consider new nodes more readily
>
> When a PeakEwma Balancer discovers a single new endpoint, it will not
> dispatch requests to the new endpoint until the RTT estimate for an
> existing endpoint exceeds _one second_. This misconfiguration leads to
> unexpected behavior.
>
> When more than one endpoint is discovered, the balancer may eventually
> dispatch traffic to some of--but not all of--the new endpoints.
>
> This change alters the PeakEwma balancer in two ways:
>
> First, the previous DEFAULT_RTT_ESTIMATE of 1s has been changed to be
> configurable (and required). The library should not hard code a default
> here.
>
> Second, the initial RTT value is now decayed over time so that new
> endpoints will eventually be considered, even when other endpoints are
> less loaded than the default RTT estimate.
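As a rough, standalone illustration of the decay idea (this is not tower's code; the function, the 300ms initial value, and the 10s decay constant are all assumptions made for the sketch), an initial RTT estimate can be decayed exponentially with the endpoint's age:
```rust
use std::time::Duration;

/// Decay an initial RTT estimate toward zero as the endpoint ages, so that
/// a newly discovered endpoint is eventually tried even when existing
/// endpoints report RTTs below the configured default estimate.
fn decayed_initial_rtt(initial: Duration, age: Duration, decay: Duration) -> Duration {
    // Exponential decay: initial * e^(-age / decay).
    let factor = (-(age.as_secs_f64() / decay.as_secs_f64())).exp();
    Duration::from_secs_f64(initial.as_secs_f64() * factor)
}

fn main() {
    // Both values are illustrative, not tower-balance's defaults.
    let initial = Duration::from_millis(300);
    let decay = Duration::from_secs(10);
    for &secs in &[0u64, 5, 15, 60] {
        let age = Duration::from_secs(secs);
        println!("age {:>2}s -> estimate {:?}", secs, decayed_initial_rtt(initial, age, decay));
    }
}
```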
This untangles some of the HTTP/gRPC glue, providing services/stacks
that have more specific focuses. The `HyperServerSvc` now *only*
converts to a `tower::Service`, and the HTTP/1.1 and Upgrade pieces were
moved to a specific `proxy::http::upgrade::Service`.
Several stack modules were added to `proxy::grpc`, which can map request
and response bodies into `Payload`, or into `grpc::Body`, as needed.
Signed-off-by: Sean McArthur <sean@buoyant.io>
* 80b4ec5 (tag: v0.1.13) Bump version to v0.1.13 (#324)
* 6b23542 Add client support for server push (#314)
* 6d8554a Reassign capacity from reset streams. (#320)
* b116605 Check whether the send side is not idle, not the recv side (#313)
* a4ed615 Check minimal versions (#322)
* ea8b8ac Avoid prematurely unlinking streams in `send_reset`, in some cases. (#319)
* 9bbbe7e Disable length_delimited deprecation warning. (#321)
* 00ca534 Update examples to use new Tokio (#316)
* 12e0d26 Added functions to access io::Error in h2::Error (#311)
* 586106a Fix push promise frame parsing (#309)
* 2b960b8 Add Reset::INTERNAL_ERROR helper to test support (#308)
* d464c6b set deny(warnings) only when cfg(test) (#307)
* b0db515 fix some autolinks that weren't resolving in docs (#305)
* 66a5d11 Shutdown the stream along with connection (#304)
Route labels are not queryable by tap, nor are they exposed in tap
events.
This change uses the newly-added fields in linkerd/linkerd2-proxy-api#17
to make Tap route-aware.
When the inbound proxy receives requests, these requests may have
relative `:authority` values like _web:8080_. Because these requests can
come from hosts with a variety of DNS configurations, the inbound proxy
can't make a sufficient guess about the fully qualified name (e.g.
_web.ns.svc.cluster.local._).
In order for the inbound proxy to discover inbound service profiles, we
need to establish some means for the inbound proxy to determine the
"canonical" name of the service for each request.
This change introduces a new `l5d-dst-canonical` header that is set by
the outbound proxy and used by the remote inbound proxy to determine
which profile should be used.
The outbound proxy determines the canonical destination by performing
DNS resolution as requests are routed and uses this name for profile and
address discovery. This change removes the proxy's hardcoded Kubernetes
dependency.
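As a hedged sketch of how such a header might be stamped by the outbound proxy and read by the inbound proxy (illustrative only; the helper functions and the example canonical name are assumptions, not the proxy's code), using the `http` crate:
```rust
use http::{header::HeaderName, HeaderValue, Request};

// Header name used to carry the canonical destination.
const L5D_DST_CANONICAL: &str = "l5d-dst-canonical";

/// Outbound side: record the canonical name resolved via DNS.
fn stamp_canonical<B>(req: &mut Request<B>, canonical: &str) {
    let name = HeaderName::from_static(L5D_DST_CANONICAL);
    if let Ok(value) = HeaderValue::from_str(canonical) {
        req.headers_mut().insert(name, value);
    }
}

/// Inbound side: read the canonical name to select a service profile.
fn canonical_dst<B>(req: &Request<B>) -> Option<&str> {
    req.headers().get(L5D_DST_CANONICAL)?.to_str().ok()
}

fn main() {
    let mut req = Request::builder()
        .uri("http://web:8080/")
        .body(())
        .unwrap();
    stamp_canonical(&mut req, "web.ns.svc.cluster.local:8080");
    assert_eq!(canonical_dst(&req), Some("web.ns.svc.cluster.local:8080"));
}
```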
The `LINKERD2_PROXY_DESTINATION_GET_SUFFIXES` and
`LINKERD2_PROXY_DESTINATION_PROFILE_SUFFIXES` environment variables
control which domains may be discovered via the destination service.
Finally, HTTP settings detection has been moved into a dedicated routing
layer at the "bottom" of the stack. This is done so that
canonicalization and discovery need not be performed redundantly for
each set of HTTP settings. Now, HTTP settings only configure the HTTP
client stack within an endpoint.
Fixes linkerd/linkerd2#1798
Since these stack pieces will never error, we can mark their `Error`s
with a type that can "never" be created. When seeing `Error = ()`, it
can mean either that the error never happens or that the detailed error
is dealt with elsewhere and only a unit is passed on. When seeing
`Error = Never`, it is clearer that the error case never happens.
Besides helping humans, this also lets LLVM remove the error branches
entirely.
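For illustration, a minimal standalone sketch of the idea behind such a type (the proxy's actual `Never` definition may differ):
```rust
/// An enum with no variants cannot be constructed, so `Result<T, Never>`
/// documents that the error arm is impossible.
#[derive(Debug)]
enum Never {}

fn always_ok() -> Result<u32, Never> {
    Ok(42)
}

fn main() {
    // The `Err` arm can never be taken; matching on a `Never` needs no arms,
    // so the compiler can eliminate the branch entirely.
    let value = match always_ok() {
        Ok(v) => v,
        Err(never) => match never {},
    };
    println!("{}", value);
}
```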
Signed-off-by: Sean McArthur <sean@buoyant.io>
The Destination Profile API, provided by linkerd2-proxy-api v0.1.3,
allows the proxy to discover route information for an HTTP service. As
the proxy processes outbound requests, in addition to doing address
resolution through the Destination service, the proxy may also discover
profiles including route patterns and labels.
When the proxy has route information for a destination, it applies the
RequestMatch for each route to find the first-matching route. The
route's labels are used to expose `route_`-prefixed HTTP metrics (and
each label is prefixed with `rt_`).
Furthermore, if a route includes ResponseMatches, they are used to
perform classification (i.e. for the `response_total` and
`route_response_total` metrics).
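To make the matching behavior concrete, here is a minimal standalone sketch of first-match route selection; the `RequestMatch` and `Route` types below are simplified stand-ins for the profile API's types, not the proxy's definitions:
```rust
/// Simplified stand-ins for the profile API's match types.
enum RequestMatch {
    PathPrefix(&'static str),
    Method(&'static str),
    Any,
}

struct Route {
    matches: Vec<RequestMatch>,
    labels: Vec<(&'static str, &'static str)>,
}

fn match_applies(m: &RequestMatch, method: &str, path: &str) -> bool {
    match m {
        RequestMatch::PathPrefix(prefix) => path.starts_with(prefix),
        RequestMatch::Method(wanted) => method.eq_ignore_ascii_case(wanted),
        RequestMatch::Any => true,
    }
}

/// The first route whose every condition holds is used; its labels become
/// the `rt_`-prefixed labels on `route_`-prefixed metrics.
fn select<'r>(routes: &'r [Route], method: &str, path: &str) -> Option<&'r Route> {
    routes
        .iter()
        .find(|route| route.matches.iter().all(|m| match_applies(m, method, path)))
}

fn main() {
    let routes = vec![
        Route {
            matches: vec![RequestMatch::Method("GET"), RequestMatch::PathPrefix("/authors")],
            labels: vec![("route", "get-authors")],
        },
        Route {
            matches: vec![RequestMatch::Any],
            labels: vec![("route", "default")],
        },
    ];
    let route = select(&routes, "GET", "/authors/123").unwrap();
    println!("matched route labels: {:?}", route.labels);
}
```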
A new `proxy::http::profiles` module implements a router that consumes
routes from an infinite stream of route lists.
The `app::profiles` module implements a client that continually and
repeatedly tries to establish a watch for the destination's routes (with
some backoff).
Route discovery does not _block_ routing; that is, the first request to
a destination will likely be processed before the route information is
retrieved from the controller (i.e. on the default route). Route
configuration is applied in a best-effort fashion.
The controller's client is instantiated in the
`control::destination::background` module and is tightly coupled to its
use for address resolution.
In order to share this client across different modules---and to bring it
into line with the rest of the proxy's modular layout---the controller
client is now configured and instantiated in `app::main`. The
`app::control` module includes additional stack modules needed to
configure this client.
Our dependency on tower-buffer has been updated so that buffered
services may be cloned.
The `proxy::reconnect` module has been extended to support a fixed
reconnect backoff, and this backoff delay is configurable via the
environment.
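As a rough sketch of the fixed-backoff idea (a synchronous toy, not the proxy's futures-based `proxy::reconnect` module; the delay value and the `try_connect` helper are made up for illustration):
```rust
use std::{thread, time::Duration};

// Illustrative stand-in for establishing a controller connection.
fn try_connect(attempt: u32) -> Result<(), String> {
    if attempt < 3 {
        Err(format!("attempt {} failed", attempt))
    } else {
        Ok(())
    }
}

fn main() {
    // In the proxy the delay would come from an environment variable; the
    // value here is purely illustrative.
    let backoff = Duration::from_millis(100);
    let mut attempt = 1;
    loop {
        match try_connect(attempt) {
            Ok(()) => break,
            Err(e) => {
                eprintln!("connect error: {}; retrying in {:?}", e, backoff);
                thread::sleep(backoff);
                attempt += 1;
            }
        }
    }
    println!("connected after {} attempts", attempt);
}
```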
As the proxy's functionality has grown, the HTTP routing functionality
has become complex. Module boundaries have become ill-defined, which
leads to tight coupling--especially around the `ctx` metadata types and
`Service` type signatures.
This change introduces a `Stack` type (and subcrate) that is used as the
base building block for proxy functionality. The `proxy` module now
exposes generic components--stack layers--that are configured and
instantiated in the `app::main` module.
This change reorganizes the repo as follows:
- Several auxiliary crates have been split out from the `src/` directory
into `lib/fs-watch`, `lib/stack` and `lib/task`.
- All logic specific to configuring and running the linkerd2 sidecar
proxy has been moved into `src/app`. The `Main` type has been moved
from `src/lib.rs` to `src/app/main.rs`.
- The `src/proxy` module contains reusable, generic components useful for building
proxies in terms of `Stack`s.
The logic contained in `lib/bind.rs`, pertaining to per-endpoint service
behavior, has almost entirely been moved into `app::main`.
`control::destination` has changed so that it is not responsible for
building services. (It used to take a clone of `Bind` and use it to
create per-endpoint services). Instead, the destination service
implements the new `proxy::Resolve` trait, which produces an infinite
`Resolution` stream for each lookup. This allows the `proxy::balance`
module to be generic over the service discovery source.
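A schematic of the idea (the trait and type names below mirror the description but are illustrative, not the proxy's actual definitions):
```rust
use std::net::SocketAddr;

trait Resolve<Target> {
    type Endpoint;
    type Resolution: Resolution<Endpoint = Self::Endpoint>;

    fn resolve(&self, target: &Target) -> Self::Resolution;
}

/// In the real proxy this would be polled asynchronously and never end; a
/// blocking signature keeps the sketch small.
trait Resolution {
    type Endpoint;

    fn next_update(&mut self) -> Update<Self::Endpoint>;
}

enum Update<E> {
    Add(SocketAddr, E),
    Remove(SocketAddr),
}

// A toy resolver that returns a fixed endpoint for any name.
struct FixedResolver(SocketAddr);
struct FixedResolution(SocketAddr);

impl Resolve<String> for FixedResolver {
    type Endpoint = ();
    type Resolution = FixedResolution;

    fn resolve(&self, _target: &String) -> FixedResolution {
        FixedResolution(self.0)
    }
}

impl Resolution for FixedResolution {
    type Endpoint = ();

    fn next_update(&mut self) -> Update<()> {
        Update::Add(self.0, ())
    }
}

fn main() {
    let resolver = FixedResolver(([127, 0, 0, 1], 8080).into());
    let mut resolution = resolver.resolve(&"web.ns.svc.cluster.local".to_string());
    if let Update::Add(addr, _) = resolution.next_update() {
        println!("add endpoint {}", addr);
    }
}
```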
Furthermore, the `router::Recognize` API has changed to only expose a
`recognize()` method and not a `bind_service()` method. The
`bind_service` logic is now modeled as a `Stack`.
The `telemetry::http` module has been replaced by a
`proxy::http::metrics` module that is generic over its metadata types
and does not rely on the old telemetry event system. These events are
now a local implementation detail of the `tap` module.
There are no user-facing changes in the proxy's behavior.
This branch changes the proxy's `trust-dns-resolver` dependency to a
version dependency rather than a Git dependency, since the
`0.10.0-alpha.3` version has the features that we previously required
the git dependency for.
The only changes to the proxy codebase itself were fixes for deprecation
warnings introduced by the dependency upgrade, since it was necessary to
update the minimum `tokio_timer` version as `trust-dns-proto` uses APIs
added in `tokio-timer` v0.2.6. In particular, `tokio_timer::Deadline`
was deprecated and replaced by `Timeout`.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Fix linkerd2-metrics test compilation
The `quickcheck` dependency was lost when the subcrate was split out.
This change restores the dependency.
Fixes linkerd/linkerd2#1685
* Run tests for all packages in `make test`
The `timeout` module has very little to do with the proxy, specifically.
This change moves it into a dedicated subcrate. This helps to clarify
dependencies and to minimize generic logic in the main proxy crate.
In doing this, an unused implementation of AsyncRead/AsyncWrite for
Timeout was deleted.
Furthermore, the `HumanDuration` type has been copied into
tests/support/mod.rs so that this type need not be part of the timeout
module's public API.
The `metrics!` macro is currently local to the telemetry module.
Furthermore, the `telemetry::metrics` module no longer has
proxy-specific logic.
This change moves the `telemetry::metrics` module into a new crate,
`linkerd2_metrics`.
This will enable unifying `telemetry::http` and `telemetry::transport`
into `http` and `transport`, respectively.
This branch should not make any functional changes.
This branch makes two minor refactorings to the `client` module in
`control::destination::background`:
1. Remove the `AddOrigin` middleware and replace it with the
`tower-add-origin` crate from `tower-http`. These middlewares are
functionally identical, but the Tower version has tests.
2. Change `ClientService` from a type alias to a tuple struct. This
means that some of the middleware that are used only in this module
(`LogErrors` and `Backoff`) are no longer part of a publicly visible
type and can be made private to the module.
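For illustration, a tiny standalone sketch of the newtype approach described in item 2; all type names other than `ClientService` are placeholders, not the proxy's actual middleware:
```rust
mod client {
    // Middleware stand-ins, private to the module.
    struct Backoff<S>(S);
    struct LogErrors<S>(S);
    struct Connection;

    // A tuple struct instead of a type alias: callers only ever see
    // `ClientService`; the composed middleware stays an implementation detail.
    pub struct ClientService(Backoff<LogErrors<Connection>>);

    impl ClientService {
        pub fn new() -> Self {
            ClientService(Backoff(LogErrors(Connection)))
        }
    }
}

fn main() {
    let _svc = client::ClientService::new();
}
```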
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
When the destination service returns a hint that an endpoint is another
proxy, eligible HTTP/1 requests are translated into HTTP/2 and sent over
an HTTP/2 connection. The original protocol details are encoded in a
header, `l5d-orig-proto`. When a proxy receives an inbound HTTP/2
request with this header, the request is translated back into its HTTP/1
representation before being passed to the internal service.
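As a rough sketch of the header mechanics (illustrative only; the helper functions are assumptions and this is not the proxy's implementation), using the `http` crate:
```rust
use http::{header::HeaderName, HeaderValue, Request, Version};

const L5D_ORIG_PROTO: &str = "l5d-orig-proto";

/// Record the original HTTP/1 version in a header before the request is
/// sent over HTTP/2.
fn downgrade_to_h2<B>(req: &mut Request<B>) {
    let orig = if req.version() == Version::HTTP_10 {
        "HTTP/1.0"
    } else {
        "HTTP/1.1"
    };
    req.headers_mut().insert(
        HeaderName::from_static(L5D_ORIG_PROTO),
        HeaderValue::from_static(orig),
    );
    *req.version_mut() = Version::HTTP_2;
}

/// On the inbound side, restore the original HTTP/1 version and strip the
/// header before handing the request to the internal service.
fn restore_h1<B>(req: &mut Request<B>) {
    if let Some(orig) = req.headers_mut().remove(L5D_ORIG_PROTO) {
        *req.version_mut() = if orig.as_bytes() == b"HTTP/1.0" {
            Version::HTTP_10
        } else {
            Version::HTTP_11
        };
    }
}

fn main() {
    let mut req = Request::builder().uri("http://web:8080/").body(()).unwrap();
    downgrade_to_h2(&mut req);
    assert_eq!(req.version(), Version::HTTP_2);
    restore_h1(&mut req);
    assert_eq!(req.version(), Version::HTTP_11);
}
```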
Signed-off-by: Sean McArthur <sean@buoyant.io>
This branch updates the proxy's `h2` dependency to v0.1.11. This version
removes a busy loop when shutting down an idle server
(carllerche/h2#296), and fixes a potential panic when dropping clients
(carllerche/h2#295).
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Specifically, proxied bodies would make use of an optimization in hyper
that resulted in the connection not knowing (it did know! it just
didn't tell itself...) that the body was finished, and so the connection
was closed. hyper 0.12.7 includes the fix.
As part of this upgrade, the keep-alive tests have been adjusted to send
a small body, since the empty body was not triggering this case.
Now that inotify-rs/inotify#105 has merged, we will no longer see
rampant CPU use from using the master version of `inotify`. I've
updated Cargo.toml to depend on master rather than on my branch.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
The `inotify-rs` library's `EventStream` implementation currently
calls `task::current().notify()` in a hot loop when a poll returns
`WouldBlock`, causing the task to constantly burn CPU.
This branch updates the `inotify-rs` dependency to point at a branch
of `inotify-rs` I had previously written. That branch rewrites the
`EventStream` to use `mio` to register interest in the `inotify` file
descriptor instead, fixing the out-of-control polling.
When inotify-rs/inotify#105 is merged upstream, we can go back to
depending on the master version of the library.
Fixes #1261
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
When the proxy receives a `CONNECT` request, the HTTP Upgrade pieces
are used since a CONNECT is very similar to an Upgrade. If the CONNECT
response back from the proxied client request is successful, the
connection is converted into a TCP proxy, just like with Upgrades.
There are currently two issues which can lead to false positives (changes being
reported when files have not actually changed) in the polling-based filesystem
watch implementation.
The first issue is that when checking each watched file for changes, the loop
iterating over each path currently short-circuits as soon as it detects a
change. This means that if two or more files have changed, the first time we
poll the fs, we will see the first change, then if we poll again, we will see
the next change, and so on.
This branch fixes that issue by always hashing all the watched files, even if a
change has already been detected. This way, if all the files change between one
poll and the next, we no longer generate additional change events until a file
actually changes again.
The other issue is that the old implementation would treat any instance of a
"file not found" error as indicating that the file had been deleted, and
generate a change event. This leads to changes repeatedly being detected as
long as a file does not exist, rather than a single time when the file's
existence state actually changes.
This branch fixes that issue as well, by only generating change events on
"file not found" errors if the file existed the last time it was polled.
Otherwise, if a file did not previously exist, we no longer generate a new
event.
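A minimal standalone sketch of the polling strategy described above (not the proxy's implementation; it hashes file contents with std's `DefaultHasher` purely for illustration):
```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Hash every watched file on each poll; report a change only when a file's
/// contents change or its existence flips (present <-> absent).
#[derive(Default)]
struct PollWatch {
    // `None` records that the file did not exist at the previous poll.
    state: HashMap<PathBuf, Option<u64>>,
}

impl PollWatch {
    /// Returns the paths that changed since the previous poll. All paths are
    /// checked on every poll; there is no short-circuiting.
    fn poll(&mut self, paths: &[&Path]) -> Vec<PathBuf> {
        let mut changed = Vec::new();
        for path in paths {
            let current = fs::read(path).ok().map(|bytes| {
                let mut hasher = DefaultHasher::new();
                bytes.hash(&mut hasher);
                hasher.finish()
            });
            match self.state.insert(path.to_path_buf(), current) {
                // A missing file counts as a change only if it existed last
                // time, and vice versa; identical hashes are not a change.
                Some(previous) if previous != current => changed.push(path.to_path_buf()),
                // First poll of this path: record a baseline silently.
                None | Some(_) => {}
            }
        }
        changed
    }
}

fn main() {
    let mut watch = PollWatch::default();
    let paths = [Path::new("Cargo.toml")];
    let _baseline = watch.poll(&paths);
    println!("changed since baseline: {:?}", watch.poll(&paths));
}
```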
I've verified both of these fixes through manual testing, as well as a new
test for the second issue. The new test fails on master but passes on this
branch.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Proxy: Add parser to distinguish proxy TLS traffic from other traffic.
Distinguish incoming TLS traffic intended for the proxy to terminate
from TLS traffic intended for the proxied service to terminate and from
non-TLS traffic.
The new version of `untrusted` is required for this to work.
Signed-off-by: Brian Smith <brian@briansmith.org>
* More tests
Signed-off-by: Brian Smith <brian@briansmith.org>
* Stop abusing `futures::Async`.
Signed-off-by: Brian Smith <brian@briansmith.org>
This branch adds process stats to the proxy's metrics, as described in
https://prometheus.io/docs/instrumenting/writing_clientlibs/#process-metrics.
In particular, it adds metrics for the process's total CPU time, number of
open file descriptors and max file descriptors, virtual memory size, and
resident set size.
This branch adds a dependency on the `procinfo` crate. Since this crate and the
syscalls it wraps are Linux-specific, these stats are only reported on Linux.
On other operating systems, they aren't reported.
Manual testing
--------------
Metrics scrape:
```
eliza@ares:~$ curl http://localhost:4191/metrics
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 19
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 45252608
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 12132352
# HELP process_start_time_seconds Time that the process started (in seconds since the UNIX epoch)
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1529017536
```
Note that the `process_cpu_seconds_total` stat is 0 because I just launched this conduit instance and it's not seeing any load; it does go up after I sent a few requests to it.
Confirm RSS & virtual memory stats w/ `ps`, and get Conduit's pid so we can check the fd stats
(note that `ps` reports virt/rss in kB while Conduit's metrics report them in bytes):
```
eliza@ares:~$ ps aux | grep conduit | grep -v grep
eliza 16766 0.0 0.0 44192 12956 pts/2 Sl+ 16:05 0:00 target/debug/conduit-proxy
```
Count conduit process's open fds:
```
eliza@ares:~$ cd /proc/16766/fd
eliza@ares:/proc/16766/fd$ ls -l | wc -l
18
```
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Any HTTP/1.1 request seen by the proxy is automatically prepared so
that, if the proxied response agrees to an upgrade, the two connections
are converted into a standard TCP proxy duplex.
Implementation
-----------------
This adds a new type, `transparency::Http11Upgrade`, which is a sort of rendezvous type for triggering HTTP/1.1 upgrades. In the h1 server service, if a request looks like an upgrade (`h1::wants_upgrade`), the request body is decorated with this new `Http11Upgrade` type. It is actually a pair, and so the second half is put into the request extensions, so that the h1 client service may look for it right before serialization. If it finds the half in the extensions, it decorates the *response* body with that half (if it looks like a response upgrade (`h1::is_upgrade`)).
The `HttpBody` type now has a `Drop` impl, which will look to see if it's been decorated with an `Http11Upgrade` half. If so, it will check for hyper's new `Body::on_upgrade()` future, and insert that into the half.
When both `Http11Upgrade` halves are dropped, the internal `Drop` will check whether both halves have supplied an upgrade. If so, the two `OnUpgrade` futures from hyper are joined on, and when they succeed, a `transparency::tcp::duplex()` future is created. This chain is spawned into the default executor.
The `drain::Watch` signal is carried along, to ensure upgraded connections still count towards active connections when the proxy wants to shutdown.
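A much-simplified sketch of the rendezvous-pair idea (the types below are illustrative stand-ins, not the proxy's `Http11Upgrade`; hyper's `OnUpgrade` futures are replaced by strings so the sketch stays standalone):
```rust
use std::sync::{Arc, Mutex};

// Stand-ins for hyper's `OnUpgrade` futures; strings keep the sketch simple.
#[derive(Default)]
struct Shared {
    server_side: Option<String>,
    client_side: Option<String>,
}

struct UpgradeHalf {
    shared: Arc<Mutex<Shared>>,
    is_server: bool,
}

fn pair() -> (UpgradeHalf, UpgradeHalf) {
    let shared = Arc::new(Mutex::new(Shared::default()));
    let server = UpgradeHalf { shared: shared.clone(), is_server: true };
    let client = UpgradeHalf { shared, is_server: false };
    (server, client)
}

impl UpgradeHalf {
    /// Each half deposits its side of the upgrade when it becomes available.
    fn insert(&self, conn: String) {
        let mut shared = self.shared.lock().unwrap();
        if self.is_server {
            shared.server_side = Some(conn);
        } else {
            shared.client_side = Some(conn);
        }
    }
}

impl Drop for UpgradeHalf {
    fn drop(&mut self) {
        // Whichever half is dropped last holds the final reference and can
        // see both sides; the real proxy would join the `OnUpgrade` futures
        // here and spawn a TCP duplex between the two connections.
        if Arc::strong_count(&self.shared) == 1 {
            let shared = self.shared.lock().unwrap();
            if let (Some(s), Some(c)) = (&shared.server_side, &shared.client_side) {
                println!("would splice {} <-> {}", s, c);
            }
        }
    }
}

fn main() {
    let (server, client) = pair();
    server.insert("server connection".to_string());
    client.insert("client connection".to_string());
    // Dropping both halves triggers the splice in the last `Drop`.
}
```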
Closes #195
Using MS Edge (and probably other clients) with the Conduit proxy when
TLS is enabled fails because Rustls doesn't take into consideration
that Conduit only supports one signature scheme (ECDSA P-256 SHA-256).
This bug was fixed in Rustls when ECDSA support was added, after the
latest release. With this change MS Edge can talk to Conduit.
Signed-off-by: Brian Smith <brian@briansmith.org>
This PR changes the proxy's Inotify watch code to avoid always falling back to
polling the filesystem when the watched files don't exist yet. It also contains
some additional cleanup and refactoring of the inotify code, including moving
the non-TLS-specific filesystem watching code out of the `tls::config` module
and into a new `fs_watch` module.
In addition, it adds tests for both the polling-based and inotify-based watch
implementations, and changes the polling-based watches to hash the files rather
than using timestamps from the file's metadata to detect changes. These changes
are originally from #1094 and #1091, respectively, but they're included here
because @briansmith asked that all the changes be made in one PR.
Closes #1094. Closes #1091. Fixes #1090. Fixes #1097. Fixes #1061.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
prost-0.4.0 has been released, which removes unnecessary dependencies.
tower-grpc is being updated simultaneously, as this is the proxy's
primary use of prost.
See: https://github.com/danburkert/prost/releases/tag/v0.4.0
* proxy: Update `rand` to 0.5.1
The proxy depends on rand-0.4, which is superseded by newer APIs in
rand-0.5. Since we're already using rand-0.5 via the tower-balance
crate, it seems appropriate to upgrade the proxy.
* Expand lock files in reviews
In e2093e3, we created a `convert` crate when refactoring the proxy's
gRPC bindings into a dedicated crate.
It's not really necessary to handle `convert` as a crate, given that it
holds a single 39-line file that's mostly comments. It's possible to
"vendor" this file in the proxy, and controller-grpc crate doesn't
even need this trait (in fact, the proxy probably doesn't either).