This is in preparation for landing the Tokio upgrade.
The test `discovery::outbound_updates_newer_services` currently contains an
assertion that an HTTP/2 request to an HTTP/1 service will return a response
with status code 500. This is because the current version of Hyper on which
Conduit depends does not support protocol upgrades.
However, commit hyperium/hyper@bc6af88a32, which
adds support for this kind of protocol upgrade, was recently merged to Hyper's
master branch. Therefore, this assertion will no longer be correct once we
depend on the upcoming Hyper release. When we migrate to the new Tokio, it will
be necessary to upgrade our Hyper dependency as well, and this test will fail.
I've modified the test to no longer make assertions about the response's status
code, so that it's compatible with both the current and future Hyper versions.
If the response is not `Ok`, the test will still fail, since
`tests::support::Client::request()` `expect`s that the response is successful,
but the status code is ignored. I've added a comment in the test explaining
this.
Eventually, when the master version of Conduit depends on the latest Hyper, we
may want to change this test to assert that the status code is 200 instead. We
may also want to add more tests for Hyper's protocol upgrade functionality, but
that seems out of scope for this PR.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Configuration values that take durations are currently specified as
time values with no units. So `600` may mean 600ms in some contexts and
10 minutes in others.
In order to avoid this problem, this change now requires that
configurations provide explicit units for time values such as '600ms' or
'10 minutes'.
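For illustration, here's a rough sketch of parsing such unit-suffixed values, assuming a simple `ms`/`s`/`m` suffix set (the actual config parser may accept a different set of units):
```rust
use std::time::Duration;

// Hypothetical helper; the real configuration code may differ.
fn parse_duration(s: &str) -> Result<Duration, String> {
    let s = s.trim();
    // Find where the numeric magnitude ends and the unit suffix begins.
    let split = s
        .rfind(|c: char| c.is_digit(10))
        .ok_or_else(|| format!("{:?} has no numeric value", s))?;
    let (magnitude, unit) = s.split_at(split + 1);
    let magnitude = magnitude.parse::<u64>().map_err(|e| e.to_string())?;
    match unit.trim() {
        // Bare numbers are rejected: units must now be explicit.
        "" => Err(format!("{:?} is missing a unit", s)),
        "ms" => Ok(Duration::from_millis(magnitude)),
        "s" => Ok(Duration::from_secs(magnitude)),
        "m" => Ok(Duration::from_secs(magnitude * 60)),
        other => Err(format!("unsupported unit {:?}; expected ms, s, or m", other)),
    }
}
```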
Fixes #27.
Currently, the mock controller, which is used in tests, takes all of its
updates a priori, which makes it hard to control when an update occurs within a
test.
Now, the controller exposes a `DstSender`, which wraps an unbounded channel of
destination updates. This allows tests to trigger updates at a specific point
in the test.
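A minimal sketch of the shape of this, assuming futures 0.1-style unbounded channels and a stand-in `Update` type (the real `DstSender` wraps the Destination API's protobuf update message):
```rust
extern crate futures;

use futures::sync::mpsc;
use std::net::SocketAddr;

// Stand-in for the Destination API's protobuf update message.
pub enum Update {
    Add(SocketAddr),
    Remove(SocketAddr),
    NoEndpoints,
}

/// Wraps the sending half of an unbounded channel of destination updates so
/// that a test can push an update at exactly the point it needs one.
pub struct DstSender(mpsc::UnboundedSender<Update>);

impl DstSender {
    pub fn send(&self, update: Update) {
        // Ignore send errors: the mock controller (the receiving half) may
        // already have shut down while the test is tearing down.
        let _ = self.0.unbounded_send(update);
    }
}
```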
In order to accomplish this, the controller's hand-rolled gRPC server
implementation has been discarded in favor of a real gRPC destination service.
This requires that the `controller-grpc` project now builds both clients
and servers for the destination service. Additionally, we now build a tap
client as well (assuming that we'll want to write tests against our tap
server).
PR #654 adds pod-based metric labels to the Destination API responses for cluster-local services.
This PR modifies the proxy to actually add these labels to reported Prometheus metrics for outbound requests to local services.
It enhances the proxy's `control::discovery` module to track these labels, and adds a `LabelRequest` middleware to the service stack built in `Bind` for labeled services. Requests transiting `LabelRequest` are given an `Extension` containing these labels, which are then added to the events produced by the `Sensors` for those requests. When these events are aggregated into Prometheus metrics, the labels are included.
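A minimal sketch of the extension mechanism, assuming the `http` crate's request extensions (the actual `LabelRequest` middleware and label types differ):
```rust
extern crate http;

use std::collections::HashMap;
use std::sync::Arc;

/// Pod-based labels from the Destination API, shared so that per-request
/// clones are cheap.
#[derive(Clone)]
pub struct DstLabels(Arc<HashMap<String, String>>);

/// What a `LabelRequest`-style middleware does to each request it sees:
/// attach the labels as an extension for downstream telemetry to read.
fn label_request<B>(labels: &DstLabels, req: &mut http::Request<B>) {
    req.extensions_mut().insert(labels.clone());
}

/// What the telemetry `Sensors` can then do when recording an event for the
/// request: pull the labels back out of the extensions.
fn labels_for<B>(req: &http::Request<B>) -> Option<&DstLabels> {
    req.extensions().get::<DstLabels>()
}
```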
I've also added some tests in `test/telemetry.rs` ensuring that these metrics are added correctly when the Destination service provides labels.
Closes #660
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
- The listener is immediately closed on receipt of a shutdown signal.
- All in-progress server connections are now counted, and the process will
  not shut down until the connection count has dropped to zero (see the
  sketch after this list).
- In the case of HTTP/1, idle connections are closed. In the case of HTTP/2,
  the graceful shutdown steps of sending GOAWAY frames are followed.
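A minimal sketch of the connection-counting idea, assuming a simple `Arc`-based handle (the proxy's actual shutdown machinery is more involved):
```rust
use std::sync::Arc;

/// Each accepted connection holds a clone of this handle; dropping the clone
/// when the connection closes decrements the shared count.
#[derive(Clone)]
pub struct ActiveConnections(Arc<()>);

impl ActiveConnections {
    pub fn new() -> Self {
        ActiveConnections(Arc::new(()))
    }

    /// True once every per-connection clone has been dropped, i.e. only the
    /// shutdown task's handle remains and it is safe to exit.
    pub fn is_idle(&self) -> bool {
        Arc::strong_count(&self.0) == 1
    }
}
```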
Previously, when the proxy could tell by parsing the request-target that it
was not in the cluster, it would not override the destination. That is,
load balancing was disabled for such destinations.
With this change, the proxy will do L7 load balancing for all HTTP
services as long as the request-target has a DNS name.
Signed-off-by: Brian Smith <brian@briansmith.org>
Previously, when the proxy was disconnected from the Destination
service and then reconnected, it would not forget old, outdated
entries in its cache of endpoints. If those endpoints had been removed
while the proxy was disconnected, the proxy would never become
aware of that.
Instead, on the first message after a reconnection, replace the entire
set of cached entries with the new set, which may be empty.
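A minimal sketch of the reset-on-reconnect behavior, assuming a simple set of endpoint addresses (the proxy's real cache is integrated with the discovery and bind machinery):
```rust
use std::collections::HashSet;
use std::net::SocketAddr;

struct EndpointCache {
    endpoints: HashSet<SocketAddr>,
    // Set when the stream to the Destination service is re-established.
    reset_on_next_update: bool,
}

impl EndpointCache {
    fn on_reconnect(&mut self) {
        self.reset_on_next_update = true;
    }

    fn on_update(&mut self, addrs: Vec<SocketAddr>) {
        if self.reset_on_next_update {
            // Everything learned before the disconnection may be stale; the
            // first post-reconnect update (possibly empty) is authoritative.
            self.endpoints.clear();
            self.reset_on_next_update = false;
        }
        self.endpoints.extend(addrs);
    }
}
```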
Prior to this change, the new test
`outbound_destinations_reset_on_reconnect_followed_by_no_endpoints_exists`
already passed, but
`outbound_destinations_reset_on_reconnect_followed_by_add_none` and
`outbound_destinations_reset_on_reconnect_followed_by_remove_none` failed.
Now all three tests pass.
Fixes #573
Signed-off-by: Brian Smith <brian@briansmith.org>
In order to ensure we catch discovery and routing issues arising from different logic for HTTP/1 and HTTP/2 requests, I've modified `tests/discovery.rs` to run all applicable tests with both HTTP/1 and HTTP/2 requests. The tests themselves are largely unchanged, but there are now separate modules containing HTTP/1 and HTTP/2 versions of a majority of the tests.
Commit 569d6939a7 introduced a regression that caused the proxy to stop using the Destination service for outbound HTTP/1 requests with no authority in the request URI but a valid authority in the `Host:` header.
The bug is due to code in `Outbound::recognize` which assumed that a request had already been passed through `normalize_our_view_of_uri`. This was valid at one point while I was writing #492: URIs were normalized prior to `recognize`, a request `Extension` was used to mark that they had been rewritten, and the host header and request URI could be assumed to be in agreement. However, after merging #514 into the dev branch for #492, this behaviour changed and I forgot to update the logic in `recognize`.
I've fixed the issue by adding the logic for routing on `Host:` headers back into `Outbound::recognize`.
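A minimal sketch of the fallback, assuming the `http` 0.1 API in use at the time (the real `Outbound::recognize` also handles normalization, default ports, and so on):
```rust
extern crate http;

use http::uri::Authority;

/// Prefer an authority in the request URI (absolute-form request target);
/// otherwise fall back to the `Host:` header.
fn authority_to_route<B>(req: &http::Request<B>) -> Option<Authority> {
    if let Some(authority) = req.uri().authority_part() {
        return Some(authority.clone());
    }
    req.headers()
        .get(http::header::HOST)
        .and_then(|host| host.to_str().ok())
        .and_then(|host| host.parse::<Authority>().ok())
}
```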
@seanmonstar added a test in `discovery.rs`, `outbound_http1_asks_controller_about_host`, which should exercise this case. I've added a couple more unit tests in that file to try and ensure we cover more of the different cases that can occur here.
Fixes #552
This was caused by a new call to `env_logger::init()` being added after the
PR that rewrote all existing calls to `env_logger::try_init()` had landed.
Fixes #469
Signed-off-by: Brian Smith <brian@briansmith.org>
Stop initializing env_logger in every test. In env_logger 0.5, it
may only be initialized once per process.
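For example, a test that still wants logging can initialize it defensively, ignoring the error if another test in the same process got there first (a sketch; the shared test-support code may do this differently):
```rust
extern crate env_logger;

fn init_test_logging() {
    // `init()` panics if a logger is already set; `try_init()` just returns
    // an error we can ignore, so this is safe to call from every test.
    let _ = env_logger::try_init();
}
```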
Also, Prost will soon upgrade to env_logger 0.5 and this will
(eventually) help reduce the number of versions of env_logger we
have to build. Turning off the regex feature will (eventually) also
reduce the number of dependencies we have to build. Unfortunately,
as it is now, the number of dependencies has increased because
env_logger increased its dependencies in 0.5.
Signed-off-by: Brian Smith <brian@briansmith.org>
Closes #403.
When the Destination service does not return a result for a service, the proxy connection for that service will hang indefinitely waiting for a result from Destination. If, for example, the requested name doesn't exist, this means that the proxy will wait forever, rather than responding with an error.
I've added a timeout wrapping the service returned from `<Outbound as Recognize>::bind_service`. The timeout can be configured by setting the `CONDUIT_PROXY_BIND_TIMEOUT` environment variable, and defaults to 10 seconds (because that's the default value for [a similar configuration in Linkerd](https://linkerd.io/config/1.3.5/linkerd/index.html#router-parameters)).
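A minimal sketch of the configuration lookup, assuming whole-second values for simplicity (the proxy's actual config layer differs):
```rust
use std::env;
use std::time::Duration;

// Hypothetical helper; the real config code parses this alongside the
// proxy's other environment variables.
fn bind_timeout() -> Duration {
    env::var("CONDUIT_PROXY_BIND_TIMEOUT")
        .ok()
        .and_then(|s| s.parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or_else(|| Duration::from_secs(10))
}
```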
Testing with @klingerf's reproduction from #403:
```
curl -sIH 'Host: httpbin.org' $(minikube service proxy-http --url)/get | head -n1
HTTP/1.1 500 Internal Server Error
```
proxy logs:
```
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy using controller at HostAndPort { host: Domain("proxy-api.conduit.svc.cluster.local"), port: 8086 }
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy routing on V4(127.0.0.1:4140)
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy proxying on V4(0.0.0.0:4143) to None
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy::transport::connect "controller-client", DNS resolved proxy-api.conduit.svc.cluster.local to 10.0.0.240
proxy-5698f79b66-8rczl conduit-proxy ERR! conduit_proxy::map_err turning service error into 500: Inner(Timeout(Duration { secs: 10, nanos: 0 }))
```
The proxy will check that the requested authority looks like a local service; if it doesn't, it will no longer ask the Destination service about the request and will instead just use the `SO_ORIGINAL_DST` address, naturally enabling egress.
The rules used to determine if it looks like a local service come from this comment:
> If default_zone.is_none() and the name is in the form $a.$b.svc, or if !default_zone.is_none() and the name is in the form $a.$b.svc.$default_zone, for some a and some b, then use the Destination service. Otherwise, use the IP given.
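A minimal sketch of that rule (the proxy's actual implementation parses and normalizes names differently):
```rust
/// Returns true if `name` should be resolved via the Destination service
/// under the rule quoted above.
fn is_local_service(name: &str, default_zone: Option<&str>) -> bool {
    let expected_suffix = match default_zone {
        None => "svc".to_string(),
        Some(zone) => format!("svc.{}", zone),
    };
    // Strip a trailing dot from a fully-qualified name, then require exactly
    // `$a.$b.` followed by the expected suffix.
    let mut labels = name.trim_end_matches('.').split('.');
    let a = labels.next().unwrap_or("");
    let b = labels.next().unwrap_or("");
    if a.is_empty() || b.is_empty() {
        return false;
    }
    labels.collect::<Vec<_>>().join(".") == expected_suffix
}
```
For example, with no default zone configured, `web.default.svc` would use the Destination service, while `example.com` would be routed to its `SO_ORIGINAL_DST`.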
The proxy will now try to detect what protocol new connections are
using, and route them accordingly. Specifically:
- HTTP/2 stays the same.
- HTTP/1 is now accepted, and will try to send an HTTP/1 request
to the target.
- If neither HTTP/1 nor 2, assume a TCP stream and simply forward
between the source and destination.
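A minimal sketch of preface-based detection, assuming the proxy can peek at the first bytes of a new connection (the real detection logic must also handle partial reads and avoid consuming the peeked bytes):
```rust
/// The fixed HTTP/2 client connection preface (RFC 7540, section 3.5).
const H2_PREFACE: &'static [u8] = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";

enum Protocol {
    Http2,
    Http1,
    Tcp,
}

/// Guess the protocol from bytes peeked off a new connection.
fn detect(peeked: &[u8]) -> Protocol {
    if peeked.starts_with(H2_PREFACE) {
        Protocol::Http2
    } else if looks_like_http1(peeked) {
        Protocol::Http1
    } else {
        // Not recognizably HTTP: treat it as an opaque TCP stream and simply
        // forward bytes between source and destination.
        Protocol::Tcp
    }
}

/// Very rough check: an HTTP/1 request line starts with an uppercase token
/// method followed by a space, e.g. `GET / HTTP/1.1`.
fn looks_like_http1(buf: &[u8]) -> bool {
    buf.split(|&b| b == b' ')
        .next()
        .map(|method| !method.is_empty() && method.iter().all(|b| b.is_ascii_uppercase()))
        .unwrap_or(false)
}
```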
* tower-h2: fix Server Clone bounds
* proxy: implement Async{Read,Write} extra methods for Connection
Closes #130. Closes #131.
We’ve built Conduit from the ground up to be the fastest, lightest,
simplest, and most secure service mesh in the world. It features an
incredibly fast and safe data plane written in Rust, a simple yet
powerful control plane written in Go, and a design that’s focused on
performance, security, and usability. Most importantly, Conduit
incorporates the many lessons we’ve learned from over 18 months of
production service mesh experience with Linkerd.
This repository contains a few tightly-related components:
- `proxy` -- an HTTP/2 proxy written in Rust;
- `controller` -- a control plane written in Go with gRPC;
- `web` -- a UI written in React, served by Go.