Currently, the conduit proxy uses a simplistic Round-Robin load
balancing algorithm. This strategy degrades severely when individual
endpoints exhibit abnormally high latency.
This change improves this situation somewhat by making the load balancer
aware of the number of outstanding requests to each endpoint. When nodes
exhibit high latency, they should tend to have more pending requests
than faster nodes, so the Power-of-Two-Choices node selector can be
used to distribute requests to less-loaded instances.
From the Finagle guide:
The algorithm randomly picks two nodes from the set of ready endpoints
and selects the least loaded of the two. By repeatedly using this
strategy, we can expect a manageable upper bound on the maximum load of
any server.
The maximum load variance between any two servers is bound by
`ln(ln(n))` where `n` is the number of servers in the cluster.
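For illustration only, here is a minimal sketch of the selection step in Rust; the `Endpoint` type, its `pending` counter, and the use of the `rand` crate are placeholder assumptions, not the proxy's actual types:

```rust
use rand::Rng;

/// Placeholder endpoint handle tracking its number of outstanding requests.
struct Endpoint {
    pending: usize,
}

/// Power-of-two-choices: pick two distinct endpoints at random and
/// return the index of the one with fewer outstanding requests.
fn p2c_select(endpoints: &[Endpoint]) -> Option<usize> {
    let n = endpoints.len();
    match n {
        0 => None,
        1 => Some(0),
        _ => {
            let mut rng = rand::thread_rng();
            let a = rng.gen_range(0..n);
            // Choose a second index distinct from the first.
            let mut b = rng.gen_range(0..n - 1);
            if b >= a {
                b += 1;
            }
            Some(if endpoints[a].pending <= endpoints[b].pending { a } else { b })
        }
    }
}
```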
Signed-off-by: Oliver Gould <ver@buoyant.io>
The proxy depends on `protoc`-generated gRPC bindings to communicate
with the controller. In order to generate these bindings, build-time
dependencies must be compiled.
In order to support a more granular, cacheable build scheme, a new crate
has been created to house these gRPC bindings,
`conduit-proxy-controller-grpc`.
Because `TryFrom` and `TryInto` conversions are implemented for
protobuf-defined types, the `convert` module also had to be moved
into a dedicated crate.
Furthermore, because the proxy's tests require that
`quickcheck::Arbitrary` be implemented for protobuf types, the
`conduit-proxy-controller-grpc` crate supports an _arbitrary_ feature
flag.
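As a rough sketch of how such a feature gate is typically wired up (the message type below is a hypothetical placeholder, and the `quickcheck` signature shown is the pre-1.0 generic form), the `Arbitrary` impls compile only when the crate is built with the _arbitrary_ feature:

```rust
#[cfg(feature = "arbitrary")]
use quickcheck::{Arbitrary, Gen};

/// Hypothetical protobuf-generated message, for illustration only.
#[derive(Clone, Debug)]
pub struct TcpAddress {
    pub ip: u32,
    pub port: u32,
}

// Compiled only when the crate is built with `--features arbitrary`,
// e.g. by the proxy's test suite; production builds pay no cost.
#[cfg(feature = "arbitrary")]
impl Arbitrary for TcpAddress {
    fn arbitrary<G: Gen>(g: &mut G) -> Self {
        TcpAddress {
            ip: u32::arbitrary(g),
            // Keep the port in the valid u16 range.
            port: u32::arbitrary(g) % (u16::MAX as u32 + 1),
        }
    }
}
```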
While we're moving these libraries around, the `tower-router` crate has
been moved to `proxy/router` and renamed to `conduit-proxy-router`.
`futures-mpsc-lossy` has been moved into the proxy directory but has not
been renamed.
Finally, the `proxy/Dockerfile-deps` image has been updated to avoid the
wasteful building of dependency artifacts, as they are not actually used
by `proxy/Dockerfile`.
The conduit repo includes several library projects that have since been
moved into external repos, including `tower-grpc` and `tower-h2`.
This change removes these vendored libraries in favor of using the new
external crates.
The proxy will now try to detect what protocol new connections are
using, and route them accordingly. Specifically:
- HTTP/2 stays the same.
- HTTP/1 is now accepted, and will try to send an HTTP/1 request
to the target.
- If neither HTTP/1 nor HTTP/2, assume a TCP stream and simply forward
between the source and destination.
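As a loose sketch of the detection idea (the constants and helper names here are illustrative, not the proxy's actual code), the listener can peek at the first bytes of a new connection and branch on what it finds:

```rust
// HTTP/2 clients begin with this fixed connection preface.
const H2_PREFACE: &[u8] = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";

enum Protocol {
    Http2,
    Http1,
    Tcp,
}

/// Classify a connection from its first peeked bytes.
fn detect(peeked: &[u8]) -> Protocol {
    if peeked.starts_with(H2_PREFACE) {
        Protocol::Http2
    } else if looks_like_http1_request_line(peeked) {
        Protocol::Http1
    } else {
        // Neither HTTP/1 nor HTTP/2: treat as an opaque TCP stream and
        // forward bytes between the source and destination.
        Protocol::Tcp
    }
}

/// Very loose HTTP/1 heuristic: an uppercase ASCII method token followed by a space.
fn looks_like_http1_request_line(buf: &[u8]) -> bool {
    buf.split(|&b| b == b' ')
        .next()
        .map(|method| !method.is_empty() && method.iter().all(u8::is_ascii_uppercase))
        .unwrap_or(false)
}
```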
* tower-h2: fix Server Clone bounds
* proxy: implement Async{Read,Write} extra methods for Connection
Closes #130. Closes #131.
See #132. This PR adds a protocol field to the ClientTransport and ServerTransport messages, and modifies the proxy to report a value for this field (currently, it's only ever HTTP).
Currently, HTTP/1 and HTTP/2 are collapsed into one Protocol variant, see #132 (comment). I expect that we can treat H1 as a subset of H2 as far as metrics goes.
Note that after discussing it with @klingerf, I learned that the control plane telemetry API currently does not do anything with the ClientTransport and ServerTransport messages, so beyond regenerating the protobuf-generated code, no controller changes were actually necessary. As we actually add metrics to TCP transports, we'll want to make some additions to the telemetry API to ingest these metrics. If any metrics are shared between HTTP and raw TCP transports (say, bytes sent), we'll want to differentiate between them in Prometheus. All the metrics that the control plane currently ingests from telemetry reports are likely to be HTTP-specific (requests, responses, response latencies), or at least, do not apply to raw TCP.
Actually adding metrics to raw TCP transports will probably have to wait until there are raw TCP transports implemented in the proxy...
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Proxy: Map unqualified/partially-qualified names to FQDN
Previously we required the service to fully qualify all service names
for outbound traffic. Many services are written assuming that
Kubernetes will complete names using its DNS search path, and those
services weren't working with Conduit.
Now add an option, enabled by default, to fully qualify these domain names.
Currently only Kubernetes-like name completion for services is
supported, but the configuration syntax is open-ended to allow for
alternatives in the future. Also, the auto-completion can be disabled
for applications that prefer to ensure they're always using unambiguous
names. Once routing is implemented, it is likely that (default)
routing rules will replace these hard-coded rules.
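A minimal sketch of Kubernetes-style completion, purely for illustration; the `cluster.local` zone and the namespace handling below are assumptions, not the configuration syntax this change introduces:

```rust
/// Complete an unqualified or partially-qualified service name the way a
/// Kubernetes DNS search path would, e.g. `web` -> `web.default.svc.cluster.local`.
fn qualify(name: &str, default_namespace: &str) -> String {
    // A trailing dot marks the name as already absolute.
    if let Some(absolute) = name.strip_suffix('.') {
        return absolute.to_string();
    }
    match name.matches('.').count() {
        0 => format!("{}.{}.svc.cluster.local", name, default_namespace), // `web`
        1 => format!("{}.svc.cluster.local", name),                       // `web.ns`
        _ => name.to_string(), // assume anything else is already fully qualified
    }
}
```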
Unit tests for the name completion logic are included.
Part of the solution for #9. The changes to `conduit inject` to
actually use this facility will be in another PR.
Previously every use of `BoundPort` repeated a bunch of logic.
Move the repeated logic to `BoundPort` itself. Just remove the no-op
handshaking logic; new handshaking logic will be added to `BoundPort`
when TLS is added.
Previously the default value of this setting was in lib.rs instead of
being automatically set in `Config` like all the other defaults, which
was inconsistent and confusing.
Fix this by moving the defaulting logic to `Config`.
Validated by running the test suite.
Previously the logic related to listening for incoming TCP connections
was duplicated in several places.
Begin centralizing this logic. Future commits will centralize it
further.
No validation was done other than running the test suite.
Previously `Process` did its own environment variable parsing and did
not benefit from the improved error handling that `config` now has.
Additionally, future changes will need access to these same environment
variables in other parts of the proxy.
Move `Process`'s environment variable parsing to `config` to address
both of these issues. Now there are no uses of `env::var` outside of
`config` except for logging, which is the final desired state.
I validated this manually.
* Proxy: Use production config parsing in tests
Previously the testing code for the proxy was sensitive to the values
of environment variables unintentionally, because `Config` looked at
the environment variables. Also, the tests were largely avoiding
testing the production configuration parsing code since they were
doing their own parsing.
Now the tests avoid looking at environment variables other than
`ENV_LOG`, which makes them more resilient. Also the tests now parse
the settings using the same code that production uses.
I validated this manually.
This PR adds a configurable timeout duration after which in-flight telemetry reports are dropped, cancelling the corresponding RPC request to the control plane.
I've also made the `Timeout` implementation used in `TimeoutConnect` generic, and reused it in multiple places, including the timeout for in-flight reports.
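A minimal sketch of the idea, using current tokio APIs for illustration rather than the crate versions the proxy used at the time; the function and error type are placeholders:

```rust
use std::time::Duration;
use tokio::time::timeout;

/// Placeholder error type for a failed telemetry report.
#[derive(Debug)]
struct ReportError;

/// Send a telemetry report, but give up if it takes longer than `deadline`.
/// Dropping the timed-out future cancels the in-flight RPC to the control plane.
async fn send_report_with_deadline<F>(report: F, deadline: Duration)
where
    F: std::future::Future<Output = Result<(), ReportError>>,
{
    match timeout(deadline, report).await {
        Ok(Ok(())) => {} // delivered
        Ok(Err(e)) => eprintln!("telemetry report failed: {:?}", e),
        Err(_elapsed) => eprintln!("telemetry report timed out; dropping"),
    }
}
```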
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
We’ve built Conduit from the ground up to be the fastest, lightest,
simplest, and most secure service mesh in the world. It features an
incredibly fast and safe data plane written in Rust, a simple yet
powerful control plane written in Go, and a design that’s focused on
performance, security, and usability. Most importantly, Conduit
incorporates the many lessons we’ve learned from over 18 months of
production service mesh experience with Linkerd.
This repository contains a few tightly-related components:
- `proxy` -- an HTTP/2 proxy written in Rust;
- `controller` -- a control plane written in Go with gRPC;
- `web` -- a UI written in React, served by Go.