Commit Graph

286 Commits

Author SHA1 Message Date
Sean McArthur 04e4b4409b update httparse to v1.3.2
Allows using SIMD instructions when parsing.

Signed-off-by: Sean McArthur <sean@buoyant.io>
2018-07-19 16:12:35 -07:00
Sean McArthur 11e7eb6357 update hyper to v0.12.6
Brings in fix to reduce connection churn related to keep-alive racing.

Signed-off-by: Sean McArthur <sean@buoyant.io>
2018-07-19 16:12:35 -07:00
Eliza Weisman 30a48a7d8b
Accept TLS connections even when TLS configuration isn't available (#22)
Closes linkerd/linkerd2#1272.

Currently, if TLS is enabled but the TLS configuration isn't available
(yet), the proxy will pass through all traffic to the application.
However, the destination service will tell other proxies to send TLS
traffic to the pod unconditionally, so the proxy will pass through to the
application TLS handshakes that are actually destined for the proxy itself.

In linkerd/linkerd2#1272, @briansmith suggested that we change the 
proxy so that when it hasn't yet loaded a TLS configuration, it will
accept TLS handshakes, but fail them. This branch implements that 
behaviour by making the `rustls::sign::CertifiedKey` in `CertResolver`
optional, and changing the `CertResolver` to return `None` when 
`rustls` asks it to resolve a certificate in that case. The server
config watch is now initially created with `Conditional::Some` with an
empty `CertResolver`, rather than `Conditional::None(NoConfig)`, so
that the proxy will accept incoming handshakes, but fail them.
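
A minimal sketch of the idea (stand-in types and method names, not the actual proxy or rustls API): the resolver holds an optional key, and resolving simply returns that option, so an empty resolver accepts handshakes but can never complete them.

```
use std::sync::Arc;

// Stand-in for `rustls::sign::CertifiedKey`.
struct CertifiedKey;

// Resolver whose key material may not have been loaded yet.
struct CertResolver {
    certified_key: Option<Arc<CertifiedKey>>,
}

impl CertResolver {
    // Used for the initial server config: handshakes are accepted,
    // but no certificate can be presented.
    fn empty() -> Self {
        CertResolver { certified_key: None }
    }

    // Called when the TLS library asks us to resolve a certificate;
    // returning `None` makes the handshake fail rather than letting the
    // connection pass through to the application.
    fn resolve(&self) -> Option<Arc<CertifiedKey>> {
        self.certified_key.clone()
    }
}
```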

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-07-18 17:06:45 -07:00
Eliza Weisman f208acb3a5
Fix incorrect process_cpu_seconds_total metric (#7)
Fixes linkerd/linkerd2#1239.

The proxy's `process_cpu_seconds_total` metric is currently calculated
incorrectly and will differ from the CPU stats reported by other 
sources. This is because it currently calculates the CPU time by summing
the `utime` and `stime` fields of the stat struct returned by `procinfo`.
However, those numbers are expressed in _clock ticks_, not seconds, so
the metric is expressed in the wrong unit.

This branch fixes this issue by using `sysconf` to get the number of
clock ticks per second when the process sensor is created, and then
dividing `utime + stime` by that number, so that the value is expressed
in seconds.
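
A sketch of the unit conversion, assuming the `libc` crate; the function names are illustrative rather than the proxy's actual sensor code:

```
// Clock ticks per second, queried once when the process sensor is created.
fn clock_ticks_per_second() -> u64 {
    unsafe { libc::sysconf(libc::_SC_CLK_TCK) as u64 }
}

// `utime` and `stime` from the procfs stat struct are in clock ticks;
// dividing by the tick rate yields seconds.
fn cpu_seconds_total(utime: u64, stime: u64, ticks_per_second: u64) -> u64 {
    (utime + stime) / ticks_per_second
}
```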

## Demonstration:

(Note that the last column in `ps aux` output is the CPU time total)
```
eliza@ares:~$ ps aux | grep linkerd2-proxy | grep -v grep
eliza    40703  0.2  0.0  45580 14864 pts/0    Sl+  13:49   0:03 target/debug/linkerd2-proxy
eliza@ares:~$ curl localhost:4191/metrics
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 3
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 19
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 46673920
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 15220736
# HELP process_start_time_seconds Time that the process started (in seconds since the UNIX epoch)
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1531428584
eliza@ares:~$
```

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-07-17 15:48:02 -07:00
Brian Smith 1fdcbfaba6
Replace "conduit" with "linkerd" in TLS test data. (#17)
This is purely aesthetic; the TLS logic doesn't care about the product name.

The test data was regenerated by re-running gen-certs.sh after modifying it.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-07-17 09:22:17 -10:00
Brian Smith e1b4e66836
Upgrade TLS dependencies. (#16)
Fixes linkerd/linkerd2#1330.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-07-17 09:21:59 -10:00
Sean McArthur 162f53dc8d
spawn individual admin tasks instead of joining all (#10)
Signed-off-by: Sean McArthur <sean@buoyant.io>
2018-07-17 11:50:57 -07:00
Eliza Weisman 2ac114ba65
Point inotify dependency at master (#14)
Now that inotify-rs/inotify#105 has merged, we will no longer see
rampant CPU use from using the master version of `inotify`. I've 
updated Cargo.toml to depend on master rather than on my branch.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-07-17 10:21:40 -07:00
Sean McArthur aa60ddb088
remove busy loop from destination background future when shutdown (#15)
When the proxy is shutting down, once there are no more outbound
connections, the sender side of the resolver channel is dropped. In the
admin background thread, when the destination background future is
notified of the closure, instead of shutting down itself, it just busy
loops. Now, after seeing shutdown, the background future ends as well.
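
Roughly, the fix treats the closed channel as completion. A simplified futures-0.1-style sketch with made-up names, not the proxy's actual background task:

```
use futures::{Async, Poll, Stream};

// Background future that drains resolution requests from the proxy.
struct Background<S> {
    request_rx: S,
}

impl<S: Stream<Item = (), Error = ()>> Background<S> {
    fn poll_rx(&mut self) -> Poll<(), ()> {
        loop {
            match self.request_rx.poll()? {
                // Another request to service; keep going.
                Async::Ready(Some(())) => continue,
                // The sender was dropped: the proxy is shutting down, so
                // this future finishes instead of busy looping.
                Async::Ready(None) => return Ok(Async::Ready(())),
                Async::NotReady => return Ok(Async::NotReady),
            }
        }
    }
}
```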

Signed-off-by: Sean McArthur <sean@buoyant.io>
2018-07-17 09:50:08 -07:00
Sean McArthur 9f5648d955
fix control client Backoff to poll its timer when backing off (#13)
The `Backoff` service wrapper is used for the controller client service
so that if the proxy can't find the controller (there is a connection
error), it doesn't keep trying in a tight loop, but instead waits a
couple seconds before trying again, presuming that the control plane
was rebooting.

When "backing off", a timer would be set, but it wasn't polled, so the
task was never registered to wake up after the delay. This turns out to
not have been a problem in practice, since the background destination
task was joined with other tasks that were constantly waking up,
allowing it to try again anyway.

To add tests for this, a new `ENV_CONTROL_BACKOFF_DELAY` config value
has been added, so that the tests don't have to wait the default 5
seconds.
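
The core of the fix is that the backoff timer must actually be polled so the task registers for wakeup. A simplified sketch, assuming futures 0.1 and tokio-timer's `Delay`; the state names are illustrative:

```
use futures::{Async, Future, Poll};
use tokio_timer::Delay;

enum State {
    Active,
    // Waiting out the backoff delay before retrying the controller.
    BackingOff(Delay),
}

fn poll_state(state: &mut State) -> Poll<(), ()> {
    if let State::BackingOff(ref mut delay) = *state {
        // Polling the timer registers this task to be woken when the delay
        // elapses; previously the timer was created but never polled.
        match delay.poll() {
            Ok(Async::NotReady) => return Ok(Async::NotReady),
            Ok(Async::Ready(())) => {}
            Err(_) => return Err(()),
        }
    }
    *state = State::Active;
    Ok(Async::Ready(()))
}
```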

Signed-off-by: Sean McArthur <sean@buoyant.io>
2018-07-16 12:41:47 -07:00
Brian Smith 73edefb795
Move `connection` submodule to `transport`. (#11)
This allows easier logging configuration for the entire transport system
using the common prefix `conduit_proxy::transport`. Previously, logging for
`conduit_proxy::connection` had to be configured separately.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-07-13 12:46:38 -10:00
Sean McArthur f7233fd682 Revert "remove busy loop from destination background future during shutdown (#8)"
This reverts commit 4bee7b0b55.

Signed-off-by: Sean McArthur <sean@buoyant.io>
2018-07-13 13:18:39 -07:00
Oliver Gould bbf217ff4f
Replace references to _Conduit_ (#6)
There are various comments, examples, and documentation that refer to
Conduit. This change replaces or removes these references.

CONTRIBUTING.md has been updated to refer to GOVERNANCE/MAINTAINERS.
2018-07-12 20:41:17 -07:00
Eliza Weisman 2f4c1b220a
Add labels for `tls::ReasonForNoIdentity` (#5)
Fixes linkerd/linkerd2#1276.

Currently, metrics with the `tls="no_identity"` label are duplicated.
This is because that label is generated from the `tls_status` label on
the `TransportLabels` struct, which is either `Some(())` or a
`ReasonForNoTls`. `ReasonForNoTls` has a
variant `ReasonForNoTls::NoIdentity`, which contains a
`ReasonForNoIdentity`, but when we format that variant as a label, we
always just produce the string "no_identity", regardless of the value of
the `ReasonForNoIdentity`. 

However, label types are _also_ used as hash map keys into the map that
stores the metrics scopes, so although two instances of
`ReasonForNoTls::NoIdentity` with different `ReasonForNoIdentity`s
produce the same formatted label output, they aren't _equal_, since that
field differs, so they correspond to different metrics.

This branch resolves this issue, by adding an additional label to these
metrics, based on the `ReasonForNoIdentity`. Now, the separate lines in
the metrics output that correspond to each `ReasonForNoIdentity` have a
label differentiating them from each other.

Note that the `NotImplementedForTap` and `NotImplementedForMetrics` reasons
currently never show up in metrics labels, since we don't gather metrics
from the tap and metrics servers.
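
A rough illustration of the fix; the enum variants and the extra label name (`no_tls_reason`) are assumed for the example, not the proxy's actual identifiers:

```
use std::fmt;

#[derive(PartialEq, Eq, Hash)]
enum ReasonForNoIdentity {
    NotProvidedByServiceDiscovery,
    Loopback,
}

impl fmt::Display for ReasonForNoIdentity {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            ReasonForNoIdentity::NotProvidedByServiceDiscovery => {
                f.write_str("not_provided_by_service_discovery")
            }
            ReasonForNoIdentity::Loopback => f.write_str("loopback"),
        }
    }
}

// Two scopes with different reasons used to print identical labels even
// though they were distinct hash-map keys; the extra label tells them apart.
fn fmt_tls_labels(reason: &ReasonForNoIdentity) -> String {
    format!("tls=\"no_identity\",no_tls_reason=\"{}\"", reason)
}
```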

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-07-12 16:04:25 -07:00
Sean McArthur 4bee7b0b55
remove busy loop from destination background future during shutdown (#8)
When the proxy is shutting down, once there are no more outbound
connections, the sender side of the resolver channel is dropped. In the
admin background thread, when the destination background future is
notified of the closure, instead of shutting down itself, it just busy
loops. Now, after seeing shutdown, the background future ends as well.

While examining this, I noticed all the background futures are joined
together into a single `Future` before being spawned on a dedicated
current_thread executor. Join in this case is inefficient, since *every*
single time *one* of the futures is ready, they are *all* polled again.
Since we have an executor handy, it's better to allow it to manage each
of the futures individually.
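
A sketch of the difference, assuming tokio 0.1's `current_thread` runtime; names and types are illustrative:

```
use futures::Future;
use tokio::runtime::current_thread::Runtime;

fn run_admin_tasks(tasks: Vec<Box<dyn Future<Item = (), Error = ()>>>) {
    let mut rt = Runtime::new().expect("failed to build admin runtime");
    // Before: the tasks were combined with `future::join_all`, so waking any
    // one of them re-polled all of them. Spawning them individually lets the
    // executor poll only the task that was actually notified.
    for task in tasks {
        rt.spawn(task);
    }
    rt.run().expect("admin runtime failed");
}
```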

Signed-off-by: Sean McArthur <sean@buoyant.io>
2018-07-12 15:22:32 -07:00
Oliver Gould 3c48ba7f62
Add a README (#4) 2018-07-11 16:01:54 -07:00
Oliver Gould 8db765c7bc
dev: Add a Dockerfile for development (#3)
When working on the proxy, it's important to be able to build a Docker
image that can be tested in the context of the existing linkerd2
project.

This change adds a `make docker` target that produces a docker image,
optionally tagged via the `DOCKER_TAG` environment variable.

This is intended to be used for development--especially on non-Linux
OSes.
2018-07-11 15:27:33 -07:00
Oliver Gould 0ca5d11c03
Adopt Linkerd2's governance (#2)
For the time being, @briansmith and I will serve as super-maintainers
for the linkerd2-proxy.
2018-07-10 15:59:12 -07:00
Oliver Gould 02a64e980f
ci: Publish artifacts to build.l5d.io
In order to setup continuous integration, proxy artifacts need to be
published somewhere predictable and discoverable. This change configures
Travis CI to publish proxy artifacts built from master to:

    build.l5d.io/linkerd2-proxy/linkerd2-proxy-${ref}.tar.gz
    build.l5d.io/linkerd2-proxy/linkerd2-proxy-${ref}.txt

The tarball includes an optimized proxy binary and metadata (right now, just
the LICENSE file, but later this should include additional version/build
metadata that can be used for runtime diagnostics).

The text file includes the sha256 sum of the tarball.

A `Makefile` is introduced to encapsulate build logic so that it can both
drive CI and be used manually.

Travis CI is configured to run debug-mode tests against PRs and to run a full
release package-test-publish for commits to
master.
2018-07-08 14:24:25 -07:00
Oliver Gould c23ecd0cbc
Migrate `conduit-proxy` to `linkerd2-proxy`
The proxy now honors environment variables starting with
`LINKERD2_PROXY_`.
2018-07-07 22:45:21 +00:00
Eliza Weisman ec303942ee proxy: Add tls_config_last_reload_seconds metric (#1204)
Depends on #1141.

This PR adds a `tls_config_last_reload_seconds` Prometheus metric
that reports the last time the TLS configuration files were reloaded.

Proof that it works:

Started the proxy with no certs, then generated them:
```
➜ http GET localhost:4191/metrics
HTTP/1.1 200 OK
content-encoding: gzip
content-length: 323
content-type: text/plain
date: Mon, 25 Jun 2018 23:02:52 GMT

# HELP tls_config_reload_total Total number of times the proxy's TLS config files were reloaded.
# TYPE tls_config_reload_total counter
tls_config_reload_total{status="io_error",path="example-example.crt",error_code="2"} 9
tls_config_reload_total{status="reloaded"} 3
# HELP tls_config_last_reload_seconds Timestamp of when the TLS configuration files were last reloaded successfully (in seconds since the UNIX epoch)
# TYPE tls_config_last_reload_seconds gauge
tls_config_last_reload_seconds 1529967764
# HELP process_start_time_seconds Time that the process started (in seconds since the UNIX epoch)
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1529967754
```

Started the proxy with certs already present:
```
➜ http GET localhost:4191/metrics
HTTP/1.1 200 OK
content-encoding: gzip
content-length: 285
content-type: text/plain
date: Mon, 25 Jun 2018 23:04:39 GMT

# HELP tls_config_reload_total Total number of times the proxy's TLS config files were reloaded.
# TYPE tls_config_reload_total counter
tls_config_reload_total{status="reloaded"} 4
# HELP tls_config_last_reload_seconds Timestamp of when the TLS configuration files were last reloaded successfully (in seconds since the UNIX epoch)
# TYPE tls_config_last_reload_seconds gauge
tls_config_last_reload_seconds 1529967876
# HELP process_start_time_seconds Time that the process started (in seconds since the UNIX epoch)
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1529967874
```

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-07-05 16:23:57 -07:00
Eliza Weisman dd7ac18cc5 proxy: Fix out-of-control inotify CPU use (#1263)
The `inotify-rs` library's `EventStream` implementation currently 
calls `task::current().notify()` in a hot loop when a poll returns
`WouldBlock`, causing the task to constantly burn CPU. 

This branch updates the `inotify-rs` dependency to point at a branch
of `inotify-rs` I had previously written. That branch rewrites the
`EventStream` to use `mio` to register interest in the `inotify` file
descriptor instead, fixing the out-of-control polling. 

When inotify-rs/inotify#105 is merged upstream, we can go back to 
depending on the master version of the library.

Fixes #1261

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-07-03 20:16:12 -07:00
Oliver Gould b9b35ec11c proxy: Handle connection close during TLS detection (#1256)
During protocol detection, we buffer data to detect a TLS Client Hello
message. If the client disconnects while this detection occurs, we do
not properly handle the disconnect, and the proxy may busy loop.

To fix this, we must handle the case where `read(2)` returns 0 by
creating a `Connection` with the already-closed socket.

While doing this, I've moved some of the implementation of
`ConditionallyUpgradeServerToTls::poll` into helpers on
`ConditionallyUpgradeServerToTlsInner` so that the poll method is easier
to read, hiding the inner details from the polling logic.
2018-07-03 15:36:48 -07:00
Eliza Weisman 1e39ab6ac4 proxy: Add a Prometheus metric for reporting errors loading TLS configs (#1141)
This PR adds a Prometheus stat tracking the number of times
TLS config files have been reloaded, and the number of times
reloading those files has errored. 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-07-03 15:24:20 -07:00
Eliza Weisman 5a3b1cdb3a proxy: Add TLS label in `transparency::retry_reconnect_errors` test (#1258) 2018-07-03 12:27:08 -07:00
Oliver Gould 866167a955 tap: Support `tls` labeling (#1244)
The proxy's metrics are instrumented with a `tls` label that describes
the state of TLS for each connection and associated messages.

This same level of detail is useful to get in `tap` output as well.

This change updates Tap in the following ways:
* `TapEvent` protobuf updated:
  * Added `source_meta` field including source labels
  * `proxy_direction` enum indicates which proxy server was used.
* The proxy adds a `tls` label to both source and destination meta indicating the state of each peer's connection
* The CLI uses the `proxy_direction` field to determine which `tls` label should be rendered.
2018-07-02 17:19:20 -07:00
Oliver Gould 051a7639c5 proxy: Always include `tls` label in metrics (#1243)
The `tls` label could sometimes be formatted incorrectly, without a
preceding comma.

To fix this, the `TlsStatus` type no longer formats commas so that they
must be provided in the context in which they are used (as is done
otherwise in this file).
2018-07-02 16:21:06 -07:00
Eliza Weisman 91108a2d53 proxy: Fall back to plaintext communication when a TLS handshake fails (#1173)
This branch modifies the proxy's logic for opening a connection so
that when an attempted TLS handshake fails, the proxy will retry that
connection without TLS.

This is implemented by changing the `UpgradeToTls` case in the `Future`
implementation for `Connecting`, so that rather than simply wrapping
a poll to the TLS upgrade future with `try_ready!` (and thus failing
the future if the upgrade future fails), we reset the state of the
future to the `Plaintext` state and continue looping. The `tls_status`
field of the future is changed to `ReasonForNoTls::HandshakeFailed`,
and the `Plaintext` state is changed so that if its `tls_status` is
`HandshakeFailed`, it will no longer attempt to upgrade to TLS when the
plaintext connection is successfully established.
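
A simplified sketch of the resulting state machine; the state and type names are assumed, and the real future drives actual TCP connects and TLS handshakes rather than a boolean:

```
enum ReasonForNoTls {
    HandshakeFailed,
}

enum Connecting {
    // Plaintext TCP connect; records why TLS will not be attempted, if known.
    Plaintext { why_no_tls: Option<ReasonForNoTls> },
    // TLS handshake over an established TCP connection.
    UpgradeToTls,
    Connected,
}

// One transition of the state machine once the pending step finishes.
fn advance(state: Connecting, step_succeeded: bool) -> Connecting {
    match state {
        // A handshake already failed once: stay on plaintext.
        Connecting::Plaintext { why_no_tls: Some(_) } => Connecting::Connected,
        // Plaintext connect finished with no prior failure: try TLS.
        Connecting::Plaintext { why_no_tls: None } => Connecting::UpgradeToTls,
        Connecting::UpgradeToTls if step_succeeded => Connecting::Connected,
        Connecting::UpgradeToTls => {
            // Previously the whole connect future failed here; now we reset
            // to plaintext and record the reason.
            Connecting::Plaintext {
                why_no_tls: Some(ReasonForNoTls::HandshakeFailed),
            }
        }
        Connecting::Connected => Connecting::Connected,
    }
}
```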

Closes #1084 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-06-29 17:08:03 -07:00
Brian Smith da61aace6c Proxy: Skip TLS for control plane loopback connections. (#1229)
If the controller address has a loopback host then don't use TLS to connect
to it. TLS isn't needed for security in that case. In normal configurations
the proxy isn't terminating TLS for loopback connections anyway.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-28 17:24:09 -10:00
Brian Smith 03814c18eb Proxy: Get identity of pod & controller from configuration. (#1221)
Instead of attempting to construct identities itself, have the proxy
accept fully-formed identities from whatever configures it. This allows
us to centralize the formatting of the identity strings in the Go code
that is shared between the `conduit inject`, `conduit install`, and CA
components.

One wrinkle: The pod namespace isn't necessarily available at
`conduit inject` time, so the proxy must implement a simple variable
substitution mechanism to insert the pod namespace into its identity.
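
A hypothetical illustration of such a substitution; the placeholder syntax and function name are invented for the example, since the actual identity format is defined by the Go code that configures the proxy:

```
// Hypothetical placeholder syntax; the real substitution variable and
// identity format are chosen by the configuring Go code.
fn expand_identity(template: &str, pod_namespace: &str) -> String {
    template.replace("$(pod_ns)", pod_namespace)
}
```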

This has the side-effect of enabling TLS to the controller since the
controller's identity is now available.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-27 17:17:34 -10:00
Brian Smith c67546653a Proxy: Use new destination service TLS identity scheme. (#1222)
Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-27 14:47:57 -10:00
Eliza Weisman af7b56f963 proxy: Replace >=100,000 ms latency buckets with 1, 2, 3, 4, and 5 ms (#1218)
This branch adds buckets for latencies below 10 ms to the proxy's latency
histograms, and removes the buckets for 100, 200, 300, 400, and 500 
seconds, so the largest non-infinity bucket is 50,000 ms. It also removes
comments that claimed that these buckets were the same as those created
by the control plane, as this is no longer true (the metrics are now scraped
by Prometheus from the proxy directly).

Closes #1208

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-06-27 16:53:42 -07:00
Kevin Lingerfelt 26d2bce656 Update dest service with a different tls identity strategy (#1215)
* Update dest service with a different tls identity strategy
* Send controller namespace as separate field

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-06-27 11:40:02 -07:00
Eliza Weisman 0cf3a231a2 proxy: Fix ConditionallyUpgradeServerToTls not being notified (#1209)
#1203 introduced a bug in the implementation of `Future` for 
`connection::ConditionallyUpgradeServerToTls`. If the attempt to match
the current peek buffer was incomplete, the `Future` implementation
would return `Ok(Async::NotReady)`. This results in the task yielding.
However, in this case the task would not be notified again, as the 
`NotReady` state wasn't from an underlying IO resource. Instead, the future
would _never_ be ready.

This branch fixes this issue by simply continuing the loop, so that 
we instead try to read more bytes from the socket and try to match
again, until the match is successful or the _socket_ returns `NotReady`.
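
A simplified sketch of the corrected poll loop, using futures 0.1 types; the helper closures stand in for the real peek/match and socket-read logic:

```
use futures::{Async, Poll};

enum Match {
    Complete,
    Incomplete,
}

fn poll_detect(
    mut try_match: impl FnMut() -> Match,
    mut poll_read_more: impl FnMut() -> Poll<usize, std::io::Error>,
) -> Poll<(), std::io::Error> {
    loop {
        if let Match::Complete = try_match() {
            return Ok(Async::Ready(()));
        }
        // Before the fix: `return Ok(Async::NotReady)` here, which yielded
        // without any wakeup being registered, so the task hung forever.
        match poll_read_more()? {
            // More bytes peeked from the socket; try to match again.
            Async::Ready(_) => continue,
            // Only the socket's NotReady registers a wakeup, so it is safe
            // to yield here.
            Async::NotReady => return Ok(Async::NotReady),
        }
    }
}
```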

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-06-26 16:55:55 -07:00
Sean McArthur ead3495d4a proxy: enable HTTP CONNECT request support (#1200)
When the proxy receives a `CONNECT` request, the HTTP Upgrade pieces
are used since a CONNECT is very similar to an Upgrade. If the CONNECT
response back from the proxied client request is successful, the
connection is converted into a TCP proxy, just like with Upgrades.
2018-06-26 16:45:06 -07:00
Eliza Weisman ae8b7d7391 proxy: Fix false positives in polling-based fs watches (#1140)
There are currently two issues which can lead to false positives (changes being
reported when files have not actually changed) in the polling-based filesystem
watch implementation. 

The first issue is that when checking each watched file for changes, the loop
iterating over each path currently short-circuits as soon as it detects a 
change. This means that if two or more files have changed, the first time we
poll the fs, we will see the first change, then if we poll again, we will see
the next change, and so on. 

This branch fixes that issue by always hashing all the watched files, even if a
change has already been detected. This way, if all the files change between one
poll and the next, we no longer generate additional change events until a file
actually changes again.

The other issue is that the old implementation would treat any instance of a 
"file not found" error as indicating that the file had been deleted, and 
generate a change event. This leads to changes repeatedly being detected as
long as a file does not exist, rather than a single time when the file's 
existence state actually changes.

This branch fixes that issue as well, by only generating change events on
"file not found" errors if the file existed the last time it was polled. 
Otherwise, if a file did not previously exist, we no longer generate a new 
event.
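
A simplified sketch of the corrected polling logic; names are illustrative, and file contents stand in for the real hashing:

```
use std::collections::HashMap;
use std::fs;
use std::path::PathBuf;

struct PollWatch {
    // Last observed contents per path; `None` means "did not exist last poll".
    last_seen: HashMap<PathBuf, Option<Vec<u8>>>,
}

impl PollWatch {
    // Returns true if any watched file changed since the previous poll.
    fn poll_changes(&mut self) -> bool {
        let mut changed = false;
        // Check *every* path, even after a change is found, so several
        // simultaneous changes are reported as a single event.
        for (path, last) in self.last_seen.iter_mut() {
            let current = fs::read(path).ok();
            match (&current, &*last) {
                // Only report a missing file if it existed last time.
                (None, Some(_)) => changed = true,
                (Some(new), Some(old)) if new != old => changed = true,
                (Some(_), None) => changed = true,
                _ => {}
            }
            *last = current;
        }
        changed
    }
}
```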

I've verified both of these fixes through manual testing, as well as a new
test for the second issue. The new test fails on master but passes on this
branch.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-06-26 15:56:21 -07:00
Brian Smith 72761d4552 Proxy: Allow non-Conduit-bound TLS and non-TLS through. (#1203)
On the server (accept) side of TLS, if the traffic isn't targeting the
proxy (as determined by the TLS ClientHello SNI), or if the traffic
isn't TLS, then pass it through.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-26 10:16:00 -10:00
Brian Smith ba3b40f546 Proxy: Add `transport::prefixed::Prefixed`. (#1196)
Copy most of the implementation of `connection::Connection` to create
a way to prefix a `TcpStream` with some previously-read bytes. This
will allow us to read and parse a TLS ClientHello message to see if it
is intended for the proxy to process, and then "rewind" and feed it
back into the TLS implementation if so.

This must be in the `transport` submodule in order for it to implement
the private `Io` trait.
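
A minimal sketch of the idea behind such a prefixed stream (blocking `std::io::Read` rather than the proxy's async `Io` trait):

```
use std::io::{self, Read};

// Serve previously-read bytes before touching the underlying stream.
struct Prefixed<S> {
    prefix: Vec<u8>,
    io: S,
}

impl<S: Read> Read for Prefixed<S> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if !self.prefix.is_empty() {
            // "Rewind": hand back the bytes consumed while peeking at the
            // ClientHello, then fall through to the real stream.
            let n = self.prefix.len().min(buf.len());
            buf[..n].copy_from_slice(&self.prefix[..n]);
            self.prefix.drain(..n);
            return Ok(n);
        }
        self.io.read(buf)
    }
}
```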

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-25 12:12:53 -10:00
Brian Smith 04007755f6 Proxy: Add parser to distinguish proxy TLS traffic from other traffic. (#1197)
* Proxy: Add parser to distinguish proxy TLS traffic from other traffic.

Distinguish incoming TLS traffic intended for the proxy to terminate
from TLS traffic intended for the proxied service to terminate and from
non-TLS traffic.

The new version of `untrusted` is required for this to work.

Signed-off-by: Brian Smith <brian@briansmith.org>

* More tests

Signed-off-by: Brian Smith <brian@briansmith.org>

* Stop abusing `futures::Async`.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-25 12:12:25 -10:00
Eliza Weisman 42dd8b86a4 proxy: Remove TLS client config from Process context (#1201)
As the TLS client config watch stored in `ctx::Process` is used only in
`Bind`, it's not necessary for it to be part of the process context.
Instead, it can be explicitly passed into `Bind`.

The resultant code is simpler, and resolves a potential cyclic 
dependency caused when adding `Sensors` to the watch (see 
https://github.com/runconduit/conduit/pull/1141#issuecomment-400082357).

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-06-25 15:12:03 -07:00
Eliza Weisman a8e54fbd7d proxy: Rebind controller client on TLS configuration changes (#1192)
This branch adds the rebinding logic added to outbound clients in #1185
to the controller client used in the proxy's `control::destination::background`
module. Now, if we are communicating with the control plane over TLS, we will
rebind the controller client stack if the TLS client configuration changes, 
using the `WatchService` added in  #1177.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Brian Smith <brian@briansmith.org>
Co-authored-by: Brian Smith <brian@briansmith.org>
2018-06-25 12:44:51 -07:00
Oliver Gould 7f84c66474 proxy: Move control utils into module (#1198)
control/mod.rs contains a variety of miscellaneous utilities. In
preparation of adding other types into the root of `control`, this
change creates a `control::util` module that holds them.
2018-06-25 11:05:48 -07:00
Brian Smith 6e793f3ef4 Refactor TLS configuration tests. (#1194)
Rearrange the TLS configuration loading tests to enable them to be
extended outside the tls::config submodule.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-24 21:45:56 -10:00
Brian Smith e0a9d9cba9 Report disabled TLS as "disabled" not "no_config". (#1190)
This fixes a regression introduced in PR #1187.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-22 21:48:24 -10:00
Brian Smith 7f7bc1edb8 Proxy: use `Conditional` inside TLS configuration watches. (#1187)
Simplify the code and make it easier to report finer-grained
reasoning about what part(s) of the TLS configuration are
missing.

This is based on Eliza's PR #1186.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-21 17:59:41 -10:00
Eliza Weisman 684c4d6d6d proxy: Add process stats to proxy metrics (on Linux) (#1128)
This branch adds process stats to the proxy's metrics, as described in
https://prometheus.io/docs/instrumenting/writing_clientlibs/#process-metrics.

In particular, it adds metrics for the process's total CPU time, number of 
open file descriptors and max file descriptors, virtual memory size, and 
resident set size.

This branch adds a dependency on the `procinfo` crate. Since this crate and the
syscalls it wraps are Linux-specific, these stats are only reported on Linux.
On other operating systems, they aren't reported.
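
A sketch of the platform gating only; the module layout is illustrative, not the proxy's actual code:

```
// The `procinfo`-backed implementation is compiled on Linux only.
#[cfg(target_os = "linux")]
mod process {
    // Read CPU time, fd counts, and memory sizes from procfs and report them.
    pub fn report() { /* ... */ }
}

#[cfg(not(target_os = "linux"))]
mod process {
    // On other operating systems these stats are simply not reported.
    pub fn report() {}
}
```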

Manual testing

Metrics scrape:
```
eliza@ares:~$ curl http://localhost:4191/metrics
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 19
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1024
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 45252608
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 12132352
# HELP process_start_time_seconds Time that the process started (in seconds since the UNIX epoch)
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1529017536
```
Note that the `process_cpu_seconds_total` stat is 0 because I just launched this conduit instance and it's not seeing any load; it does go up after I sent a few requests to it.

Confirm RSS & virtual memory stats w/ `ps`, and get Conduit's pid so we can check the fd stats
(note that `ps` reports virt/rss in kb while Conduit's metrics reports them in bytes):
```
eliza@ares:~$ ps aux | grep conduit | grep -v grep
eliza    16766  0.0  0.0  44192 12956 pts/2    Sl+  16:05   0:00 target/debug/conduit-proxy
```

Count conduit process's open fds:
```
eliza@ares:~$ cd /proc/16766/fd
eliza@ares:/proc/16766/fd$ ls -l | wc -l
18
```

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-06-21 17:50:10 -07:00
Eliza Weisman 1c5524ff21 proxy: Rebind endpoint stacks on TLS config changes using WatchService (#1185)
This branch changes the proxy's `Bind` module to add a middleware layer 
which watches for TLS client configuration changes and rebinds the
endpoint stacks of any endpoints with which it is able to communicate over
TLS (i.e. those with `TlsIdentity` metadata) when the client config changes. The 
rebinding is done at the level of individual endpoint stacks, rather than for the
entire service stack for the destination.

This obsoletes my previous PRs #1169 and #1175.

Closes #1161

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-06-21 17:49:19 -07:00
Oliver Gould 432695a77d proxy: Implement a WatchService (#1177)
WatchService is a middleware that rebinds its inner service
each time a Watch updates.

This is planned to be used to rebind endpoint stacks when TLS
configuration changes. Later, it should probably be moved into
the tower repo.
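
A simplified sketch of the idea (not the actual tower/proxy API): rebuild the inner service whenever the watched value changes:

```
struct WatchService<T, S> {
    current: T,
    service: S,
    // How to build a new inner service from an updated value.
    bind: fn(&T) -> S,
}

impl<T: Clone + PartialEq, S> WatchService<T, S> {
    // Called before dispatching each request.
    fn poll_update(&mut self, latest: &T) {
        if *latest != self.current {
            // The watched value changed (e.g. a new TLS client config):
            // rebind the inner service before use.
            self.current = latest.clone();
            self.service = (self.bind)(&self.current);
        }
    }
}
```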
2018-06-21 15:41:41 -07:00
Oliver Gould c0fd7a9577 proxy: Document tls::config::watch_for_config_changes (#1176)
While investigating TLS configuration, I found myself wanting a
docstring on `tls::config::watch_for_config_changes`.

This has one minor change in functionality: now, `future::empty()`
is returned instead of `future::ok(())` so that the task never completes.
It seems that, ultimately, we'll want to treat it as an error if we lose
the ability to receive configuration updates.
2018-06-21 11:05:03 -07:00
Brian Smith 71ea2c1b59 Proxy: Implement TLS conditional accept more like TLS conditional connect. (#1166)
* Proxy: Implement TLS conditional accept more like TLS conditional connect.

Clean up the accept side of the TLS configuration logic.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-06-20 19:46:31 -10:00