Commit Graph

20 Commits

Author SHA1 Message Date
Sean McArthur c278228c1b
proxy: preserve body headers in http1 (#457)
As a goal of being a transparent proxy, we want to proxy requests and
responses with as little modification as possible. Basically, servers
and clients should see messages that look the same whether the proxy was
injected or not.

With that goal in mind, we want to make sure that body headers (things
like `Content-Length`, `Transfer-Encoding`, etc) are left alone. Prior
to this commit, we at times were changing behavior. Sometimes
`Transfer-Encoding` was added to requests, or `Content-Length: 0` may
have been removed. While RC 7230 defines that differences are
semantically the same, implementations may not handle them correctly.

Now, we've added some fixes to prevent any of these header changes
from occurring, along with tests to make sure library updates don't
regress.

For requests:

- With no message body, `Transfer-Encoding: chunked` should no longer be
added.
- With `Content-Length: 0`, the header is forwarded untouched.

For responses:

- Tests were added that responses not allowed to have bodies (to HEAD
requests, 204, 304) did not have `Transfer-Encoding` added.
- Tests that `Content-Length: 0` is preserved.
- Tests that HTTP/1.0 responses with no body headers do not have
`Transfer-Encoding` added.
- Tests that `HEAD` responses forward `Content-Length` headers (but not
an actual body).

Closes #447

Signed-off-by: Sean McArthur <sean@seanmonstar.com>
2018-03-05 18:10:51 -08:00
Sean McArthur f9d8f3d94a
proxy: detect TCP socket hang ups from client or server (#463)
We previously `join`ed on piping data from both sides, meaning
that the future didn't complete until **both** sides had disconnected.
Even if the client disconnected, it was possible the server never knew,
and we "leaked" this future.

To fix this, the `join` is replaced with a `Duplex` future, which pipes
from both ends into the other, while also detecting when one side shuts
down. When a side does shutdown, a write shutdown is forwarded to the
other side, to allow draining to occur for deployments that half-close
sockets.

Closes #434
2018-03-02 10:14:54 -08:00
Brian Smith d993820cb3
Fix intermittent outbound_times_out failure. (#471)
This was caused by the fact that a new instance of `env_logger::init()`
was added after the PR that rewrote them all to `env_logger::try_init()`
was added.

Fixes #469

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-26 19:07:36 -10:00
Brian Smith 6b4d294a40
Reduce memory allocations during logging. (#445)
Stop initializing env_logger in every test. In env_logger 0.5, it
may only be initialized once per process.

Also, Prost will soon upgrade to env_logger 0.5 and this will
(eventually) help reduce the number of versions of env_logger we
have to build. Turning off the regex feature will (eventually) also
reduce the number of dependencies we have to build. Unfortunately,
as it is now, the number of dependencies has increased because
env_logger increased its dependencies in 0.5.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-26 18:32:47 -10:00
Brian Smith 8607875267
Stop using the url crate in the proxy. (#450)
Version 1.7.0 of the url crate seems to be broken which means we cannot
`cargo update` the proxy without locking url to version 1.6. Since we only
use it in a very limited way anyway, and since we use http::uri for parsing
much more, just switch all uses of the url crate to use http::uri for parsing
instead.

This eliminates some build dependencies.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-26 08:55:48 -10:00
Eliza Weisman 694f691b71
Add timeout to Outbound::bind_service (#436)
Closes #403.

When the Destination service does not return a result for a service, the proxy connection for that service will hang indefinitely waiting for a result from Destination. If, for example, the requested name doesn't exist, this means that the proxy will wait forever, rather than responding with an error.

I've added a timeout wrapping the service returned from `<Outbound as Recognize>::bind_service`. The timeout can be configured by setting the `CONDUIT_PROXY_BIND_TIMEOUT` environment variable, and defaults to 10 seconds (because that's the default value for [a similar configuration in Linkerd](https://linkerd.io/config/1.3.5/linkerd/index.html#router-parameters)).

Testing with @klingerf's reproduction from #403:
```
curl -sIH 'Host: httpbin.org' $(minikube service proxy-http --url)/get | head -n1
HTTP/1.1 500 Internal Server Error
```
proxy logs:
```rust
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy using controller at HostAndPort { host: Domain("proxy-api.conduit.svc.cluster.local"), port: 8086 }
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy routing on V4(127.0.0.1:4140)
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy proxying on V4(0.0.0.0:4143) to None
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy::transport::connect "controller-client", DNS resolved proxy-api.conduit.svc.cluster.local to 10.0.0.240
proxy-5698f79b66-8rczl conduit-proxy ERR! conduit_proxy::map_err turning service error into 500: Inner(Timeout(Duration { secs: 10, nanos: 0 }))
```
2018-02-26 10:18:35 -08:00
Eliza Weisman 6309741ae7
Add flaky_tests feature for skipping some tests on CI (#441)
This PR adds a `flaky_tests` cargo feature to control whether or not to ignore tests that are timing-dependent. This feature is enabled by default in local builds, but disabled on CI and in all Docker builds.

Closes #440
2018-02-26 10:17:53 -08:00
Sean McArthur 381fb3800e
proxy: don't send transfer-encoding for empty GET requests (#410)
This is fixed in hyper v0.11.19.

Closes #402
2018-02-23 16:22:45 -08:00
Sean McArthur 236f71fbe0
proxy: use original dst if authority doesnt look like local service (#397)
The proxy will check that the requested authority looks like a local service, and if it doesn't, it will no longer ask the Destination service about the request, instead just using the SO_ORIGINAL_DST, enabling egress naturally.

The rules used to determine if it looks like a local service come from this comment:

> If default_zone.is_none() and the name is in the form $a.$b.svc, or if !default_zone.is_none() and the name is in the form $a.$b.svc.$default_zone, for some a and some b, then use the Destination service. Otherwise, use the IP given.
2018-02-20 18:09:21 -08:00
Eliza Weisman 8bc497a057
Remove unused metrics (#322)
Removed the `method` label from Prometheus, and removed HTTP methods from reports. Removed `StreamSummary` from reports and replaced it with a `u32` count of streams.

Closes #266 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-02-09 17:14:17 -08:00
Eliza Weisman 458e9d2ac5
Remove per-path metrics from telemetry pipeline (#317)
Follow-up from #315.

Now that the UIs don't report per-path metrics, we can remove the path label from Prometheus, the path aggregation and filtering options from the telemetry API, and the path field from the proxy report API.

I've modified the tests to no longer expect the removed fields, and manually verified that Conduit still works after making these changes.

Closes #265 

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-02-09 14:20:28 -08:00
Eliza Weisman 915f08ac4c
Store proxy latencies in a structure that matches controller histogram (#11)
The proxy currently stores latency values in an `OrderMap` and reports every observed latency value to the controller's telemetry API since the last report. The telemetry API then sends each individual value to Prometheus. This doesn't scale well when there are a large number of proxies making reports. 

I've modified the proxy to use a fixed-size histogram that matches the histogram buckets in Prometheus. Each report now includes an array indicating the histogram bounds, and each response scope contains a set of counts corresponding to each index in the bounds array, indicating the number of times a latency in that bucket was observed. The controller then reports the upper bound of each bucket to Prometheus, and can use the proxy's reported set of bucket bounds so that the observed values will be correct even if the bounds in the control plane are changed independently of those set in the proxy.

I've also modified `simulate-proxy` to generate the new report structure, and added tests in the proxy's telemetry test suite validating the new behaviour.
2018-02-07 18:02:59 -08:00
Oliver Gould e2093e37f8
Move the Rust gRPC bindings to a dedicated crate (#275)
The proxy depends on `protoc`-generated gRPC bindings to communicate
with the controller. In order to generate these bindings, build-time
dependencies must be compiled.

In order to support a more granular, cacheable build scheme, a new crate
has been created to house these gRPC bindings,
`conduit-proxy-controller-grpc`.

Because `TryFrom` and `TryInto` conversions are implemented for
protobuf-defined types, the `convert` module also had to be moved to
into a dedicated crate.

Furthermore, because the proxy's tests require that
`quickcheck::Aribtrary` be implemented for protobuf types, the
`conduit-proxy-controller-grpc` crate supports an _arbitrary_ feature
fla protobuf types, the `conduit-proxy-controller-grpc` crate supports
an _arbitrary_ feature flag.

While we're moving these libraries around, the `tower-router` crate has
been moved to `proxy/router` and renamed to `conduit-proxy-router.`
`futures-mpsc-lossy` has been moved into the proxy directory but has not
been renamed.

Finally, the `proxy/Dockerfile-deps` image has been updated to avoid the
wasteful building of dependency artifacts, as they are not actually used
by `proxy/Dockerfile`.
2018-02-06 10:31:48 -08:00
Eliza Weisman eddc37de28 Adopt external tower-grpc and tower-h2 deps #225)
The conduit repo includes several library projects that have since been
moved into external repos, including `tower-grpc` and `tower-h2`.

This change removes these vendored libraries in favor of using the new
external crates.
2018-02-01 11:57:02 -08:00
Sean McArthur 9720a32de7
proxy: fix tcp_with_no_orig_dst test (#229)
Sometimes, the try_read will return a connection error, sometimes it
will just return EOF. Handle both cases.

Closes #226
2018-01-29 15:15:06 -08:00
Sean McArthur b861318e86
proxy: fix h1 streams to trigger response end events
Response End events were only triggered after polling the trailers of
a response, but when the Response is given to a hyper h1 server, it
doesn't know about trailers, so they were never polled!

The fix is that the `BodyStream` glue will now poll the wrapped body for
trailers after it sees the end of the data, before telling hyper the
stream is over. This ensures a ResponseEnd event is emitted.

Includes a proxy telemetry test over h1 connections.
2018-01-25 16:36:16 -08:00
Sean McArthur 54aef56e25
proxy: add transparent protocol detection and handling
The proxy will now try to detect what protocol new connections are
using, and route them accordingly. Specifically:

- HTTP/2 stays the same.
- HTTP/1 is now accepted, and will try to send an HTTP/1 request
  to the target.
- If neither HTTP/1 nor 2, assume a TCP stream and simply forward
  between the source and destination.

* tower-h2: fix Server Clone bounds
* proxy: implement Async{Read,Write} extra methods for Connection

Closes #130 
Closes #131
2018-01-23 16:14:07 -08:00
Brian Smith 559f4a76fb
Proxy: Use production config parsing in tests (#25)
* Proxy: Use production config parsing in tests

Previosuly the testing code for the proxy was sensitive to the values
of environment variables unintentionally, because `Config` looked at
the environment variables. Also, the tests were largely avoiding
testing the production configuration parsing code since they were
doing their own parsing.

Now the tests avoid looking at environment variables other than
`ENV_LOG`, which makes them more resilient. Also the tests now parse
the settings using the same code as production use uses.

I validated this manually.
2017-12-13 19:27:50 -06:00
Oliver Gould 980f85963d apply rustffmt on proxy, remove rustfmt.toml for now 2017-12-05 00:44:16 +00:00
Oliver Gould b104bd0676 Introducing Conduit, the ultralight service mesh
We’ve built Conduit from the ground up to be the fastest, lightest,
simplest, and most secure service mesh in the world. It features an
incredibly fast and safe data plane written in Rust, a simple yet
powerful control plane written in Go, and a design that’s focused on
performance, security, and usability. Most importantly, Conduit
incorporates the many lessons we’ve learned from over 18 months of
production service mesh experience with Linkerd.

This repository contains a few tightly-related components:
- `proxy` -- an HTTP/2 proxy written in Rust;
- `controller` -- a control plane written in Go with gRPC;
- `web` -- a UI written in React, served by Go.
2017-12-05 00:24:55 +00:00