Commit Graph

51 Commits

Author SHA1 Message Date
Eliza Weisman d9112abc93
proxy: remove unused metrics (#826)
This PR removes the unused `request_duration_ms` and `response_duration_ms` histogram metrics from the proxy. It also removes them from the `simulate-proxy` script's output, and from `docs/proxy-metrics.md`

Closes #821
2018-04-23 16:05:20 -07:00
Eliza Weisman 455c99d4d4
Ignore flaky metrics tests on CI (#832)
Fixes #831.

Proxy metrics tests `transport::inbound_tcp_accept` and `transport::inbound_tcp_duration` are known to be flaky and should be ignored on CI. Note that the outbound versions of these tests were already marked as flaky, so this was almost certainly either an oversight or the result of an incorrect merge.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-04-23 14:29:34 -07:00
Eliza Weisman 79304cdcaf
proxy: Unbreak process_start_time_seconds metric (#825)
The refactoring of how metrics are formatted in 674ce87588 inadvertently introduced a bug that caused the `process_start_time_seconds` metric to be formatted as just a number without the metric name. This causes Prometheus to fail with a parse error rather than accepting the metrics.

I've fixed this issue, and added a unit test to detect regressions in the future.
2018-04-20 15:59:08 -07:00
Eliza Weisman c19da70965
proxy: Add classifications to TCP close stats (#790)
This PR adds a `classification` label to transport level metrics collected on transport close. Like the `classification` label on HTTP response metrics, the value may be either `"success"` or `"failure"`. The label value is determined based on the `clean` field on the `TransportClose` event, which indicates whether a transport closed cleanly or due to an error.

I've updated the tests for transport-level metrics to reflect the addition of the new label. I'd like to also modify the test support code to allow us to close transports with errors, in order to test that the errors are correctly classified as failures.
2018-04-19 19:01:48 -07:00
Eliza Weisman 674ce87588
proxy: Add transport-level metrics (#785)
This branch adds all the transport-level Prometheus metrics as described in #742, with the exception of the `tcp_connections_open` gauge (to be added in a subsequent branch). 

A brief description of the metrics added in this branch:
* `tcp_accept_open_total`: counter of the number of connections accepted by the proxy
* `tcp_accept_close_total`: counter of the number of accepted connections that have closed
* `tcp_connect_open_total`: counter of the number of connections opened by the proxy
* `tcp_connect_close_total`: counter of the number of connections opened by the proxy that have been closed.
* `tcp_connection_duration_ms`: histogram of the total duration of each TCP connection (incremented on connection close)
* `sent_bytes`: counter of the total number of bytes sent on TCP connections (incremented on connection close)
* `received_bytes`: counter of the total number of bytes received on TCP connections (incremented on connection close)

 These metrics are labeled with the direction (inbound or outbound) and whether the connection was proxied as raw TCP or corresponds to an HTTP request. 

Additionally, I've added several proxy tests for these metrics. Note that there are some cases which are currently untested; in particular, while there are tests for the `tcp_accept_close_total` counter, it's more difficult to test the `tcp_connect_close_total` counter, due to connection pooling. I'd like to improve the tests for this code in additional branches.
2018-04-19 17:27:43 -07:00
Oliver Gould 491fae7cc4
proxy: Rewrite mock controller to accept a stream of dst updates (#808)
Currently, the mock controller, which is used in tests, takes all of its
updates a priori, which makes it hard to control when an update occurs within a
test.

Now, the controller exposes a `DstSender`, which wraps an unbounded channel of
destination updates. This allows tests to trigger updates at a specific point
in the test.

In order to accomplish this, the controller's hand-rolled gRPC server
implementation has been discarded in favor of a real gRPC destination service.
This requires that the `controller-grpc` project now builds both clients
and servers for the destination service. Additionally, we now build a tap
client as well (assuming that we'll want to write tests against our tap
server).
2018-04-19 11:01:10 -07:00
Eliza Weisman 6121afb6f2
Factor out reused test fixtures from telemetry tests (#782)
This is a fairly minor refactor to the proxy telemetry tests. b07b554d2b added a `Fixture` in the Destination service labeling tests added in #661 to reduce the repetition of copied and pasted code in those tests. I've refactored most of the other telemetry tests to also use the test fixture. Significantly less code is copied and pasted now.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-04-17 14:15:56 -07:00
Sean McArthur 3cd16e8e40
proxy: clean up some logs and a few warnings in proxy tests (#780)
Signed-off-by: Sean McArthur <sean@seanmonstar.com>
2018-04-17 12:53:20 -07:00
Oliver Gould efdfc93b50
Stop pushing telemetry reports from the proxy (#616)
Now that the controller does not depend on pushed telemetry reports, the
proxy need not depend on the telemetry API or maintain legacy sampling
logic.
2018-04-12 17:39:29 -07:00
Eliza Weisman 61d15a6c3e
Ignore flaky telemetry tests on CI (#752)
The tests for label metadata updates from the control plane are flaky on CI. This is likely due to the CI containers not having enough cores to execute the test proxy thread, the test proxy's controller client thread, the mock controller thread, and the test server thread simultaneously --- see #751 for more information. 

For now, I'm ignoring these on CI. Eventually, I'd like to change the mock controller code in test support so that we can trigger it to send a second metadata update only after the request has finished.

I think this issue also makes merging #738 a higher priority, so that we can still have some tests running on CI that exercise some part of the label update behaviour.
2018-04-12 14:59:17 -07:00
Eliza Weisman b07b554d2b
Add labels from service discovery to proxy metrics reports (#661)
PR #654 adds pod-based metric labels to the Destination API responses for cluster-local services. 

This PR modifies the proxy to actually add these labels to reported Prometheus metrics for outbound requests to local services. 

It enhances the proxy's `control::discovery` module to track these labels and add a `LabelRequest` middleware to the service stack built in `Bind` for labeled services. Requests transiting `LabelRequest` are given an `Extension` which contains these labels, which are then added to events produced by the `Sensors` for these requests. When these events are aggregated to Prometheus metrics, the labels are added.

I've also added some tests in `test/telemetry.rs` ensuring that these metrics are added correctly when the Destination service provides labels.

Closes #660

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-04-12 12:54:38 -07:00
Sean McArthur 7f54b5253d
proxy: fix flaky tcp graceful shutdown test (#735) 2018-04-10 19:47:00 -07:00
Sean McArthur 02c6887020
proxy: improve graceful shutdown process (#684)
- The listener is immediately closed on receipt of a shutdown signal.
- All in-progress server connections are now counted, and the process will
  not shutdown until the connection count has dropped to zero.
- In the case of HTTP1, idle connections are closed. In the case of HTTP2,
  the HTTP2 graceful shutdown steps are followed of sending various
  GOAWAYs.
2018-04-10 14:15:37 -07:00
Brian Smith 7319cf648f
Proxy: Do L7 load balancing for all external HTTP services. (#726)
Previously when the proxy could tell, by parsing, the request-target
is not in the cluster, it would not override the destination. That is,
load balancing would be disabled for such destinations.

With this change, the proxy will do L7 load balancing for all HTTP
services as long as the request-target has a DNS name.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-04-10 08:07:16 -10:00
Eliza Weisman 605e68dff6
Add pretty durations to panics from `assert_eventually!` (#677)
This PR adds the pretty-printing for durations I added in #676 to the panic message from the `assert_eventually!` macro added in #669. 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-04-06 10:49:17 -07:00
Eliza Weisman 49bf01b0da
Add `assert_eventually!` macro to help de-flake telemetry tests (#669)
Closes #615.

Based on @olix0r's suggestion in https://github.com/runconduit/conduit/issues/613#issuecomment-376024744, this PR adds an `assert_eventually!` macro to retry an assertion a set number of times, waiting for 15 ms between retries. This is loosely based on ScalaTest's [eventually](http://doc.scalatest.org/1.8/org/scalatest/concurrent/Eventually.html).

I've rewritten the flaky telemetry tests to use the `assert_eventually!` macro, to compensate for delays in the served metrics being updated between client requests and metrics scrapes.
2018-04-05 11:23:34 -07:00
Phil Calçado 19001f8d38 Add pod-based metric_labels to destinations response (#429) (#654)
* Extracted logic from destination server
* Make tests follow style used elsewhere in the code
* Extract single interface for resolvers
* Add tests for k8s and ipv4 resolvers
* Fix small usability issues
* Update dep
* Act on feedback
* Add pod-based metric_labels to destinations response
* Add documentation on running control plane to BUILD.md

Signed-off-by: Phil Calcado <phil@buoyant.io>

* Fix mock controller in proxy tests (#656)

Signed-off-by: Eliza Weisman <eliza@buoyant.io>

* Address review feedback
* Rename files in the destination package

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-04-02 18:36:57 -07:00
Sean McArthur 47f9665b8e
proxy: allow disable protocol detection on specific ports (#648)
- Adds environment variables to configure a set of ports that, when an
  incoming connection has an SO_ORIGINAL_DST with a port matching, will
  disable protocol detection for that connection and immediately start a
  TCP proxy.
- Adds a default list of well known ports: SMTP and MySQL.

Closes #339
2018-04-02 14:24:36 -07:00
Brian Smith f931dec3b3
Proxy: Completely replace current set of destinations on reconnect (#632)
Previosuly, when the proxy was disconnected from the Destination
service and then reconnects, the proxy would not forget old, outdated
entries in its cache of endpoints. If those endpoints had been removed
while the proxy was disconnected then the proxy would never become
aware of that.

Instead, on the first message after a reconnection, replace the entire
set of cached entries with the new set, which may be empty.

Prior to this change, the new test
outbound_destinations_reset_on_reconnect_followed_by_no_endpoints_exists
passed already
but outbound_destinations_reset_on_reconnect_followed_by_add_none
and outbound_destinations_reset_on_reconnect_followed_by_remove_none
failed. Now all these tests pass.

Fixes #573

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-03-29 16:50:08 -10:00
Eliza Weisman c688cf6028
Add response classification to proxy metrics (#639)
This PR adds a `classification` label to proxy response metrics, as @olix0r described in https://github.com/runconduit/conduit/issues/634#issuecomment-376964083. The label is either "success" or "failure", depending on the following rules:
+ **if** the response had a gRPC status code, *then*
   - gRPC status code 0 is considered a success
   - all others are considered failures
+ **else if** the response had an HTTP status code, *then*
  - status codes < 500 are considered success,
  - status codes >= 500 are considered failures
+ **else if** the response stream failed **then**
  - the response is a failure.

I've also added end-to-end tests for the classification of HTTP responses (with some work towards classifying gRPC responses as well). Additionally, I've updated `doc/proxy_metrics.md` to reflect the added `classification` label.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-03-28 14:49:00 -07:00
Brian Smith 7247ffeee3
Proxy: Clarify destination test support code queue handling (#617)
Use `VecDeqeue` to make the queue structure clear. Follow good practice
by minimizing the amount of time the lock is held. Clarify how
defaulting logic works.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-03-26 10:45:05 -10:00
Oliver Gould 006360aa90
Skip flaky tests for #613 (#614)
The metrics endpoint tests are flaky because there are no guarantees
that the metrics pipeline has processed events before the metrics
endpoint is read. This can cause CI to fail spuriously.

Disable these tests from running in CI until #613 is resolved.
2018-03-25 14:26:14 -07:00
Eliza Weisman 9321932918
Add request_duration_ms metric and increment request_total on request end (#589)
This PR adds the `request_duration_ms` metric to the Prometheus metrics exported by the proxy. It also modifies the `request_total` metric so that it is incremented when a request stream finishes, rather than when it opens, for consistency with how the `response_total` metric is generated.

Making this change required modifying `telemetry::sensors::http` to generate a `StreamRequestEnd` event similar to the `StreamResponseEnd` event. This is done similarly to how sensors are added to response bodies, by generalizing the `ResponseBody` type into a `MeasuredBody` type that can wrap a request or response body. Since this changed the type of request bodies, it necessitated changing request types pretty much everywhere else in the proxy codebase in order to fix the resulting type errors, which is why the diff for this PR is so large.

Closes #570
2018-03-22 15:27:34 -07:00
Eliza Weisman 3f10c80256
Fix double comma in outbound metrics (#601)
Fixes #600 

The proxy metrics endpoint has a bug where metrics recorded in the outbound direction can contain two commas in a row when no outbound label is present. This occurs because the code for formatting the outbound direction label mistakenly assumed that there would always be a destination pod owner label as well, but the proxy isn't currently aware of the destination's pod owner (waiting for #429). 

I've fixed this issue by moving the place where the comma is output from the `fmt::Display` impl for `RequestLabels` to the `fmt::Display` impl for `OutboudnLabels`. This way, the comma between the `direction` and `dst_*` labels is only output when the `dst_*` label is present. 

This bug made it to master since all of the proxy end-to-end tests for metrics only test the inbound router. I've rectified this issue by adding tests on the outbound router as well (which would fail against the current master due to the double comma bug). I've also added a test that asserts there are no double commas in exported metrics, to protect against regressions to this bug.
2018-03-22 14:17:10 -07:00
Eliza Weisman 1c9ce4d118
Add Prometheus /metrics endpoint to proxy (#569)
This PR adds an endpoint to the proxy that serves metrics in Prometheus' text exposition format. The endpoint currently serves the `request_total`, `response_total`, `response_latency_ms`, and `response_duration_ms metrics`, as described in #536. The endpoint's port and address are configurable with the `CONDUIT_PROXY_METRICS_LISTENER` environment variable.

Tests have been added in t`ests/telemetry.rs`
2018-03-21 16:19:32 -07:00
Sean McArthur cd59465366
proxy: add SIGTERM and SIGINT handlers (#581)
When the proxy is run in a Docker container, it runs as PID 1, with
no default signal handlers setup. In order to react to signals from
Kubernetes about shutting down, we need to set up explicit handlers.

This adds handlers for SIGTERM and SIGINT.

Closes #549
2018-03-16 18:53:20 -07:00
Eliza Weisman 8da70bb6e2
Run all discovery tests for HTTP/1 as well as HTTP/2 (#556)
In order to ensure we catch discovery and routing issues arising from different logic for HTTP/1 and HTTP/2 requests, I've modified tests/discovery.rs to run all applicable tests with both HTTP/1 and HTTP/2 requests. The tests themselves are largely unchanged, but now there are separate modules containing HTTP/1 and HTTP/2 versions of a majority of the tests.
2018-03-09 17:24:48 -08:00
Eliza Weisman d62a869e68
Fix outbound HTTP/1 requests not using Destinations (#555)
Commit 569d6939a7 introduced a regression that caused the proxy to stop using the Destination service for outbound HTTP/1 requests with no authority in the request URI but a valid authority in the `Host:` header. 

The bug is due to some code in `Outbound::recognize` which assumed that a request had already been passed through `normalize_our_view_of_uri`. This was valid at one point while I was writing #492, as URIs were normalized prior to `recognize` and a request `Extension` was used to mark that they had been rewritten, and the host header and request URI could be assumed to be in agreement, but after merging #514 into the dev branch for #492, this behaviour changed and I forgot to update the logic in `recognize`.

I've fixed the issue by adding the logic for routing on `Host:` headers back into `Outbound::recognize`.

@seanmonstar added a test in `discovery.rs`, `outbound_http1_asks_controller_about_host`, which should exercise this case. I've added a couple more unit tests in that file to try and ensure we cover more of the different cases that can occur here.

Fixes #552
2018-03-09 16:25:19 -08:00
Sean McArthur 83d6a1f579
proxy: improve transparency of host headers and absolute-uris (#535)
In some cases, we would adjust an existing Host header, or add one. And in all cases when an HTTP/1 request was received with an absolute-form target, it was not passed on.

Now, the Host header is never changed. And if the Uri was in absolute-form, it is sent in the same format.

Closes #518
2018-03-08 13:15:21 -08:00
Brian Smith 649e784d9c
Simplify cluster zone suffix handling in the proxy (#528)
* Temporarily stop trying to support configurable zones in the proxy.

None of the zone configuration is tested and lots of things assume the cluster
zone is `cluster.local`. Further, how exactly the proxy will actually learn the
cluster zone hasn't been decided yet.

Just hard-code the zone as "cluster.local" in the proxy until configurable zones
are fully implemented and tested to be working correctly.

Signed-off-by: Brian Smith <brian@briansmith.org>

* Remove the CONDUIT_PROXY_DESTINATIONS_AUTOCOMPLETE_FQDN setting

The way that Kubernetes configures DNS search suffixes has some negative
consequences as some names like "example.com" are ambiguous: depending on
whether there is a service "example" in the "com" namespace, "example.com"
may refer to an external service or an internal service, and this can
fluctuate over time. In recognition of that we added the
CONDUIT_PROXY_DESTINATIONS_AUTOCOMPLETE_FQDN setting, thinking this would
be part of a solution for users to opt out of the unfortunate behavior
if their applications didn't depend on the DNS search suffix feature.

It turns out similar effects can be acheived using a custom dnsConfig,
starting in Kubernetes 1.10 when dnsConfig reaches the beta stability level.
Now any CONDUIT_PROXY_DESTINATIONS_AUTOCOMPLETE_FQDN-based seems duplicative.
Further, attempting to support it optionally made the code complex and hard
to read.

Therefore, let's just remove it. If/when somebody actually requests this
functionality then we can add it back, if dnsConfig isn't a valid alternative
for them.

Signed-off-by: Brian Smith <brian@briansmith.org>

* Further hard-code "cluster.local" as the zone, temporarily.

Addresses review feedback.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-03-07 14:30:13 -10:00
Eliza Weisman 569d6939a7
Enforce that requests are mapped to connections for each Host: header values (#492)
This PR ensures that the mapping of requests to outbound connections is segregated by `Host:` header values. In most cases, the desired behavior is provided by Hyper's connection pooling. However, Hyper does not handle the case where a request had no `Host:` header and the request URI had no authority part, and the request was routed based on the SO_ORIGINAL_DST in the desired manner. We would like these requests to each have their own outbound connection, but Hyper will reuse the same connection for such requests. 

Therefore, I have modified `conduit_proxy_router::Recognize` to allow implementations of `Recognize` to indicate whether the service for a given key can be cached, and to only cache the service when it is marked as cachable. I've also changed the `reconstruct_uri` function, which rewrites HTTP/1 requests, to mark when a request had no authority and no `Host:` header, and the authority was rewritten to be the request's ORIGINAL_DST. When this is the case, the `Recognize` implementations for `Inbound` and `Outbound` will mark these requests as non-cachable.

I've also added unit tests ensuring that A, connections are created per `Host:` header, and B, that requests with no `Host:` header each create a new connection. The first test passes without any additional changes, but the second only passes on this branch. The tests were added in PR #489, but this branch supersedes that branch.

Fixes #415. Closes #489.
2018-03-06 16:44:14 -08:00
Sean McArthur c278228c1b
proxy: preserve body headers in http1 (#457)
As a goal of being a transparent proxy, we want to proxy requests and
responses with as little modification as possible. Basically, servers
and clients should see messages that look the same whether the proxy was
injected or not.

With that goal in mind, we want to make sure that body headers (things
like `Content-Length`, `Transfer-Encoding`, etc) are left alone. Prior
to this commit, we at times were changing behavior. Sometimes
`Transfer-Encoding` was added to requests, or `Content-Length: 0` may
have been removed. While RC 7230 defines that differences are
semantically the same, implementations may not handle them correctly.

Now, we've added some fixes to prevent any of these header changes
from occurring, along with tests to make sure library updates don't
regress.

For requests:

- With no message body, `Transfer-Encoding: chunked` should no longer be
added.
- With `Content-Length: 0`, the header is forwarded untouched.

For responses:

- Tests were added that responses not allowed to have bodies (to HEAD
requests, 204, 304) did not have `Transfer-Encoding` added.
- Tests that `Content-Length: 0` is preserved.
- Tests that HTTP/1.0 responses with no body headers do not have
`Transfer-Encoding` added.
- Tests that `HEAD` responses forward `Content-Length` headers (but not
an actual body).

Closes #447

Signed-off-by: Sean McArthur <sean@seanmonstar.com>
2018-03-05 18:10:51 -08:00
Sean McArthur f9d8f3d94a
proxy: detect TCP socket hang ups from client or server (#463)
We previously `join`ed on piping data from both sides, meaning
that the future didn't complete until **both** sides had disconnected.
Even if the client disconnected, it was possible the server never knew,
and we "leaked" this future.

To fix this, the `join` is replaced with a `Duplex` future, which pipes
from both ends into the other, while also detecting when one side shuts
down. When a side does shutdown, a write shutdown is forwarded to the
other side, to allow draining to occur for deployments that half-close
sockets.

Closes #434
2018-03-02 10:14:54 -08:00
Brian Smith d993820cb3
Fix intermittent outbound_times_out failure. (#471)
This was caused by the fact that a new instance of `env_logger::init()`
was added after the PR that rewrote them all to `env_logger::try_init()`
was added.

Fixes #469

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-26 19:07:36 -10:00
Brian Smith 6b4d294a40
Reduce memory allocations during logging. (#445)
Stop initializing env_logger in every test. In env_logger 0.5, it
may only be initialized once per process.

Also, Prost will soon upgrade to env_logger 0.5 and this will
(eventually) help reduce the number of versions of env_logger we
have to build. Turning off the regex feature will (eventually) also
reduce the number of dependencies we have to build. Unfortunately,
as it is now, the number of dependencies has increased because
env_logger increased its dependencies in 0.5.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-26 18:32:47 -10:00
Brian Smith 8607875267
Stop using the url crate in the proxy. (#450)
Version 1.7.0 of the url crate seems to be broken which means we cannot
`cargo update` the proxy without locking url to version 1.6. Since we only
use it in a very limited way anyway, and since we use http::uri for parsing
much more, just switch all uses of the url crate to use http::uri for parsing
instead.

This eliminates some build dependencies.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-26 08:55:48 -10:00
Eliza Weisman 694f691b71
Add timeout to Outbound::bind_service (#436)
Closes #403.

When the Destination service does not return a result for a service, the proxy connection for that service will hang indefinitely waiting for a result from Destination. If, for example, the requested name doesn't exist, this means that the proxy will wait forever, rather than responding with an error.

I've added a timeout wrapping the service returned from `<Outbound as Recognize>::bind_service`. The timeout can be configured by setting the `CONDUIT_PROXY_BIND_TIMEOUT` environment variable, and defaults to 10 seconds (because that's the default value for [a similar configuration in Linkerd](https://linkerd.io/config/1.3.5/linkerd/index.html#router-parameters)).

Testing with @klingerf's reproduction from #403:
```
curl -sIH 'Host: httpbin.org' $(minikube service proxy-http --url)/get | head -n1
HTTP/1.1 500 Internal Server Error
```
proxy logs:
```rust
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy using controller at HostAndPort { host: Domain("proxy-api.conduit.svc.cluster.local"), port: 8086 }
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy routing on V4(127.0.0.1:4140)
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy proxying on V4(0.0.0.0:4143) to None
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy::transport::connect "controller-client", DNS resolved proxy-api.conduit.svc.cluster.local to 10.0.0.240
proxy-5698f79b66-8rczl conduit-proxy ERR! conduit_proxy::map_err turning service error into 500: Inner(Timeout(Duration { secs: 10, nanos: 0 }))
```
2018-02-26 10:18:35 -08:00
Eliza Weisman 6309741ae7
Add flaky_tests feature for skipping some tests on CI (#441)
This PR adds a `flaky_tests` cargo feature to control whether or not to ignore tests that are timing-dependent. This feature is enabled by default in local builds, but disabled on CI and in all Docker builds.

Closes #440
2018-02-26 10:17:53 -08:00
Sean McArthur 381fb3800e
proxy: don't send transfer-encoding for empty GET requests (#410)
This is fixed in hyper v0.11.19.

Closes #402
2018-02-23 16:22:45 -08:00
Sean McArthur 236f71fbe0
proxy: use original dst if authority doesnt look like local service (#397)
The proxy will check that the requested authority looks like a local service, and if it doesn't, it will no longer ask the Destination service about the request, instead just using the SO_ORIGINAL_DST, enabling egress naturally.

The rules used to determine if it looks like a local service come from this comment:

> If default_zone.is_none() and the name is in the form $a.$b.svc, or if !default_zone.is_none() and the name is in the form $a.$b.svc.$default_zone, for some a and some b, then use the Destination service. Otherwise, use the IP given.
2018-02-20 18:09:21 -08:00
Eliza Weisman 8bc497a057
Remove unused metrics (#322)
Removed the `method` label from Prometheus, and removed HTTP methods from reports. Removed `StreamSummary` from reports and replaced it with a `u32` count of streams.

Closes #266 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-02-09 17:14:17 -08:00
Eliza Weisman 458e9d2ac5
Remove per-path metrics from telemetry pipeline (#317)
Follow-up from #315.

Now that the UIs don't report per-path metrics, we can remove the path label from Prometheus, the path aggregation and filtering options from the telemetry API, and the path field from the proxy report API.

I've modified the tests to no longer expect the removed fields, and manually verified that Conduit still works after making these changes.

Closes #265 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-02-09 14:20:28 -08:00
Eliza Weisman 915f08ac4c
Store proxy latencies in a structure that matches controller histogram (#11)
The proxy currently stores latency values in an `OrderMap` and reports every observed latency value to the controller's telemetry API since the last report. The telemetry API then sends each individual value to Prometheus. This doesn't scale well when there are a large number of proxies making reports. 

I've modified the proxy to use a fixed-size histogram that matches the histogram buckets in Prometheus. Each report now includes an array indicating the histogram bounds, and each response scope contains a set of counts corresponding to each index in the bounds array, indicating the number of times a latency in that bucket was observed. The controller then reports the upper bound of each bucket to Prometheus, and can use the proxy's reported set of bucket bounds so that the observed values will be correct even if the bounds in the control plane are changed independently of those set in the proxy.

I've also modified `simulate-proxy` to generate the new report structure, and added tests in the proxy's telemetry test suite validating the new behaviour.
2018-02-07 18:02:59 -08:00
Oliver Gould e2093e37f8
Move the Rust gRPC bindings to a dedicated crate (#275)
The proxy depends on `protoc`-generated gRPC bindings to communicate
with the controller. In order to generate these bindings, build-time
dependencies must be compiled.

In order to support a more granular, cacheable build scheme, a new crate
has been created to house these gRPC bindings,
`conduit-proxy-controller-grpc`.

Because `TryFrom` and `TryInto` conversions are implemented for
protobuf-defined types, the `convert` module also had to be moved to
into a dedicated crate.

Furthermore, because the proxy's tests require that
`quickcheck::Aribtrary` be implemented for protobuf types, the
`conduit-proxy-controller-grpc` crate supports an _arbitrary_ feature
fla protobuf types, the `conduit-proxy-controller-grpc` crate supports
an _arbitrary_ feature flag.

While we're moving these libraries around, the `tower-router` crate has
been moved to `proxy/router` and renamed to `conduit-proxy-router.`
`futures-mpsc-lossy` has been moved into the proxy directory but has not
been renamed.

Finally, the `proxy/Dockerfile-deps` image has been updated to avoid the
wasteful building of dependency artifacts, as they are not actually used
by `proxy/Dockerfile`.
2018-02-06 10:31:48 -08:00
Eliza Weisman eddc37de28 Adopt external tower-grpc and tower-h2 deps #225)
The conduit repo includes several library projects that have since been
moved into external repos, including `tower-grpc` and `tower-h2`.

This change removes these vendored libraries in favor of using the new
external crates.
2018-02-01 11:57:02 -08:00
Sean McArthur 9720a32de7
proxy: fix tcp_with_no_orig_dst test (#229)
Sometimes, the try_read will return a connection error, sometimes it
will just return EOF. Handle both cases.

Closes #226
2018-01-29 15:15:06 -08:00
Sean McArthur b861318e86
proxy: fix h1 streams to trigger response end events
Response End events were only triggered after polling the trailers of
a response, but when the Response is given to a hyper h1 server, it
doesn't know about trailers, so they were never polled!

The fix is that the `BodyStream` glue will now poll the wrapped body for
trailers after it sees the end of the data, before telling hyper the
stream is over. This ensures a ResponseEnd event is emitted.

Includes a proxy telemetry test over h1 connections.
2018-01-25 16:36:16 -08:00
Sean McArthur 54aef56e25
proxy: add transparent protocol detection and handling
The proxy will now try to detect what protocol new connections are
using, and route them accordingly. Specifically:

- HTTP/2 stays the same.
- HTTP/1 is now accepted, and will try to send an HTTP/1 request
  to the target.
- If neither HTTP/1 nor 2, assume a TCP stream and simply forward
  between the source and destination.

* tower-h2: fix Server Clone bounds
* proxy: implement Async{Read,Write} extra methods for Connection

Closes #130 
Closes #131
2018-01-23 16:14:07 -08:00
Brian Smith 559f4a76fb
Proxy: Use production config parsing in tests (#25)
* Proxy: Use production config parsing in tests

Previosuly the testing code for the proxy was sensitive to the values
of environment variables unintentionally, because `Config` looked at
the environment variables. Also, the tests were largely avoiding
testing the production configuration parsing code since they were
doing their own parsing.

Now the tests avoid looking at environment variables other than
`ENV_LOG`, which makes them more resilient. Also the tests now parse
the settings using the same code as production use uses.

I validated this manually.
2017-12-13 19:27:50 -06:00
Oliver Gould 980f85963d apply rustffmt on proxy, remove rustfmt.toml for now 2017-12-05 00:44:16 +00:00