`tower-balance` has been updated with a Peak-EWMA load balancer; and a
new crate, `tower-h2-balance` has been introduced to make the load
balancer aware of some H2 stream events.
The Peak-EWMA balancer is designed to reduce tail latency by maintaining
an Exponentially Weighted Moving Average of latencies to each endpoint
which decay over a 10s window.
This commit adds the initial wiring to forward TLS config changes to the
watches used by TLS clients as well as TLS servers. As the TLS clients
are not yet implemented, the config type is currently `()`, but once
the client config is implemented, we should be able to drop it in
seamlessly.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Co-authored-by: Brian Smith <brian@briansmith.org>
Depends on #1032.
This branch makes some additional changes to the proxy's DNS code. In
particular, since we no longer need to clone the resolver on every lookup,
it removes some `clone()` calls in `DestinationSet::reset_dns_query`.
I've also changed the DNS futures to use the new contextual logging code
on master.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Fix non-Linux builds.
The change to signal.rs is needed for Windows.
The change to config.rs is needed for Windows and maybe other platforms.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Proxy: Better encapsulate the details of TLS config watching.
Encapsulate more of the TLS configuration logic in the TLS submodule. This allows
for easier refactoring. In particular, this will make adding the client TLS configuration
easier.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Display font-awesome icons no matter what URL is originally loaded
The URLs in the dashboard need to be relative. Unfortunately, this means that if
you load something that isn't the base route ... font-awesome icons look broken.
There's no real way to solve this from within webpack (or the web server without
some work). Instead, just load font-awesome from a CDN as there's no real
benefit we get from including it in the bundle. Fixes#1019.
* Moving font-awesome to styles
This branch adds an inotify-based implementation of filesystem watches
for the TLS config files. On Linux, where inotify is available, this is
used instead of the polling-based code I added in #1056 and #1076.
In order to avoid the issues detecting changes to files in Kubernetes
ConfigMaps described in #1061, we watch the directory _containing_ the
files we care about rather than the files themselves. I've tested this
manually in Docker for Mac Kubernetes and can confirm that ConfigMap
changes are detected successfully.
Closes#1061. Closes#369.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Proxy: Map Kubernetes Pod Namespace/Name to TLS identity.
Map the Kubernetes identity into a DNS name that can be used to
validate the peer's certificate. The final mapping is TBD; the
important thing for now is that the mapped name doesn't collide
with any real DNS name.
Encapsulate the mapping logic within the TLS submodule.
Minimize `Arc`ing and `Clone`ing of TLS identities.
This has no effect in default configurations since the settings that
enable the functionality are not set by default.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Handle an edge case when using bin/web
There's a weird error running `bin/web dev` if you don't have conduit installed on a kubernetes cluster. Nothing in the docs mention that you need to work on this.
Output a user friendly error when we can't find a pod and update the docs to remind folks to install conduit first. Fixes#1070
* Wrap text, send to stderr, fail when missing
This branch changes the polling-based implementation of TLS config file watches
to fully canonicalize the path to each config file prior to polling for its
metadata. Doing so fixes the issues detecting changes when the watched path is
a symbolic link to another symbolic link (see #1061), which is how Kubernetes
implements ConfigMaps mounted as volumes.
I've manually tested this with Conduit running in Docker for Mac Kubernetes,
by volume-mounting a ConfigMap containing the TLS config files, and
regenerating, deleting, and adding the certificates. Watching the Conduit logs
confirms that the changes are now successfully detected.
Note that we have to re-canonicalize the path every time we poll the filesystem
for metadata. Otherwise, if the file is a symlink and the link target changes,
we will continue polling the _old_ link target's path, and fail to detect any
changes to the _new_ link target.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
webpki's DNSName type does full validation and normalization (lowercasing) of
DNS names, which is exactly what `dns::Name` does. webpki's DNSName type
considers a DNS name to be valid according to the rules for TLS certificates,
which is slightly stricter than what a DNS library might otherwise allow. In
anticipation of possible compatibility issues, introduce separate tls::DnsName
and dns::Name names for this type. In the future, if we find that tls::DnsName
is too strict for non-TLS cases, we can have these types diverge without
affecting TLS validation.
Signed-off-by: Brian Smith <brian@briansmith.org>
Encapsulate HTTP request ID generation logic.
Request IDs need to be globally unique, so there can only be one request ID
sequence per process. Simplify the request ID generation with that in mind,
and make it more efficient.
Signed-off-by: Brian Smith <brian@briansmith.org>
Both the conduit stat command and web UI are showing failed and completed pods.
This change filters out those pods before returning the result to the client.
Fixes#1010
Signed-off-by: Ivan Sim <ihcsim@gmail.com>
During code review of another change I noticed that a lot of types seem
to derive `Hash` (and `Eq`, `PartialEq`) even though the types should
never (for performance reasons) be used as keys of a hash table, and
where it is kind of questionable what equality should mean for those
types. Then I noticed that similarly many types implement `Clone` even
though I expect we should never be cloning them, again because of our
performance goals.
Because these types derive these traits, then whenever we add a field
to them, that field also has to implement these traits. That means we
then have to expand the problem, deriving implementations of these
traits for types that don't otherwise want/need to implement these
traits. This makes review complicated, because, for example, we have
to decide whether something should be compared case-insensitively or
case-sensitively when really we don't want to compare those things at
all.
To prove that we can get by by doing less, to speed up code review
(particularly related to some stuff related to TLS), stop deriving
`Clone`, `Eq`, `PartialEq`, and `Hash` for these types.
I believe that, in particular, the change to key the Tap hash table
based on request ID, instead of the whole request, should speed up
the tap feature since we don't hash and/or compare every field,
recursively, of requests.
Later more such cleanup of this sort should be done.
Signed-off-by: Brian Smith <brian@briansmith.org>
Depends on #1006. Depends on #1041.
This PR adds a `tls_identity` field to the endpoint `Metadata` struct, which
contains the `TlsIdentity` metadata sent by the control plane's Destination
service.
I changed the `ctx::transport::Client` context struct to hold a `Metadata`,
rather than just the labels, so the TLS support determination is always
available. In addition, I've added it as an additional parameter to
`transport::Connect::new`, so that when we create a new connection, the TLS
code will be able to determine whether or not TLS is supported and, if it is,
how to verify the endpoint's identity.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
The proxy can't actually support 10K clients currently (for one, we can't open
10K resolution streams to the destination service). 100 is a more-realistic
but sufficiently-high default.
This PR modifies the proxy's TLS code so that the TLS config files are reloaded
when any of them has changed (including if they did not previously exist).
If reloading the configs returns an error, we log an error and continue using
the old config.
Currently, this is implemented by polling the file system for the time they
were last modified at a fixed interval. However, I've implemented this so
that the changes are passed around as a `Stream`, and that reloading and
updating the config is in a separate function the one that detects changes.
Therefore, it should be fairly easy to plug in support for `inotify` (and
other FS watch APIs) later, as long as we can use them to generate a
`Stream` of changes.
Closes#369
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Depends on #974. Closes#859.
This PR updates the proxy's `dns` module to use the new `AsyncResolver` API I
added to `trust-dns-resolver` in bluejekyll/trust-dns#487. This allows us to
spawn one `Future` that will drive DNS resolution in the background, rather
than having to repeatedly clone a heavyweight `ResolverFuture` for every
lookup. It also means that we no longer have to clone the name to resolve in
quite as many places.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Closes#711. Depends on #967.
This PR changes the proxy's `destination` module to honor the TTLs associated
with DNS lookups, now that bluejekyll/trust-dns#444 has been merged and we can
access this information from the Trust-DNS Resolver API.
The `destination::background::DestinationSet` type has been modified so that,
when a successful result is received for a DNS query, the DNS server will be
polled again after the deadline associated with that query, rather than after
a fixed deadline. The fixed deadline is still used to determine when to poll
again for negative DNS responses or for errors.
Furthermore, Conduit now accepts an optional CONDUIT_PROXY_DNS_MIN_TTL
environment variable that will configure a minimum TTL for DNS results. If the
deadline of a DNS response gives it a TTL shorter than the configured minimum,
Conduit will not poll DNS again until after that minimum TTL is elapsed. By
default, there is no minimum value set, as this feature is intended primarily
for when Conduit is run locally for development purposes.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Add initial infrastructure for optinally accepting TLS connections.
If the environment gives us the paths to the certificate chain and private key
then use TLS for all accepted TCP connections. Otherwise, continue on using
plaintext for all accepted TCP connections. The default behavior--no TLS--isn't
changed.
Later we'll make this smarter by adding protocol detection so that when the TLS
configuration is available, we'll accept both TLS and non-TLS connections.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Web: remove ns column from tables on individual ns page
* Add prop types and tests for MetricsTable component
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Required for #1008.
This PR adds the `TlsIdentity` message to the Destination service proto,
to describe what strategy the proxy should use for verifying an endpoint's TLS
certificates. It also adds a `TlsIdentity` field to the `WeightedAddr` message.
Currently, there is one possible variant for `TlsIdentity`, `KubernetesPodName`,
which consists of the Kubernetes pod name of the endpoint, the namespace of
the endpoint, and the namespace of that pod's Conduit control plane. The proxy
should attempt to connect over TLS if the control plane namespace matches its
own control plane namespace. The pod name and namespace are used to verify
the endpoint's TLS certificate.
See https://github.com/runconduit/conduit/issues/386#issuecomment-392948046.
This change was initially part of #1008, but I factored it out to make the diff
smaller.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Add propType validation
When refactoring components, it is hard to know what is required and isn't.
Adds propTypes to the existing components and enables eslint errors for anything
moving forward. This should keep us documenting the API for components.
* Remove extra newline
While debugging proxy issues, I found it necessary to change how logging contexts are
instrumented, especially for clients.
This change moves away from using `Debug` types to in favor of `Display` types.
Furthermore, the `logging` module now provides a uniform set of logging contexts to be
used throughout the application. All clients, servers, and background tasks should now be
instrumented so that their log messages contain predictable metadata.
Some small improvements have been made to ensure that logging contexts are correct
when a `Future` is dropped (which is important for some H2 uses, especially).
b3170af changed the DstLabels api, but the bench test was not updated
accordingly.
Furthermore, since bench tests require a nightly rust version, we've
avoided running them in CI. This makes it easy for these tests to break, however.
This updates the benches/record.rs. Additionally, in CI, we pin the rust nightly'
version to a known-good version so that we can reliably run these bench test
without the fear of external changes breaking our build.
The proxy receives a hash map of endpoint labels from the destination
service. As this map is serialized into a string, its keys and values
do not have a stable ordering.
To fix this, we sort the keys for all labels before constructing an
instance of `DstLabels`.
This change was much more difficult to test than it was to fix, so tests
this change was tested manually.
Fixes#1015
On the individual namespace pages, the filter should not be shown, as all results that appear on the that page will be for on namespace.
Added a boolean property, showNamespaceFilter, to MetricsTable that allows you to define if the filter should be shown.
Tested that the filter is not shown on namespace pages.
Fixes#972
Signed-off-by: Kim Christensen <kimworking@gmail.com>
As part of the HOC + Context merges, ResourceList missed out on the api injection and errors out on the Namespaces tab.
Wrap the returned HOC in `withContext` to make sure it is there, no matter where it is in the tree. (Fixes#1034)
Depends on tower-rs/tower#75. Required for #386
In order for the proxy to use the TLS support metadata from the Destination
service correctly, we determined that the code for dynamically changing the
labels on an already-bound service should be removed, and any change in
metadata should cause an endpoint to be rebound.
I've modified the proxy so that we no longer update the labels using
`futures-watch` (as a sidenote, we no longer depend on that crate). Metadata
update events now cause the `tower-discover::Discover` implementation for
`DestinationSet` to re-insert the changed endpoint into the load balancer.
Upstream PR tower-rs/tower#75 in tower-balance changes the load balancer
to honor duplicate insertions by replacing the old endpoint rather than
ignoring them; that change is necessary for the tests to pass on this branch.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* conduit inject: Add flag to set proxy bind timeout (#863)
* fix test
* fix flag to get it working with #909
* Add time parsing
* Use the variable to set the default value
Signed-off-by: Sacha Froment <sfroment42@gmail.com>
* Add a HOC for the REST API tooling
We're copying and duplicating logic all over the place with components that need to talk to the API.
Moves most of the REST API tooling into a HOC that can be used by other components. Now, a component can use `withREST`, pass in the promises that it would like resolved and receive the responses as props.
* Show PageHeader whether there's an error or not
* Hiding page header during loading
* Test updates to work with namespace restructuring
* Rename so_original_dst.rs to addr_info.rs.
Prepare for expanding the functionality of this module by renaming it.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Abstract I/O interface into a trait.
Instead of pattern matching over an `Io` variant, use a `Box<Io>` to
abstract the I/O interface. This will make it easier to add a TLS
transport.
Signed-off-by: Brian Smith <brian@briansmith.org>
Changes to `BoundPort::listen_and_fold` inadvertently broke the
`::logging::context_future`s on the `serve` futures for the Inbound and
outbound proxies, leading to log messages that didn't have the appropriate
context. This fixes that.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
In an effort to highlight the namespace overview pages, remove the Deployments,
Replication Controllers and Pods items from the sidebar and replace them with direct
links to individual Namespace pages. If the user has more than 8 namespaces, only
list the first 8 (the rest can be accessed by the namespace list page).
The Deployments/RCs/Pods endpoints are still available if you go directly to
/deployments, /pods, etc. but they're not highlighted to the user.
The Grafana dashboards currently show Request Volume by ns/deploy/pod.
Add a `meshed` dimension to the Request Volume graphs, in anticipation
of the `meshed`/`secured` label from the proxy. Also increase `irate`
time window queries from `20s` to `30s`, per recommendation from
Prometheus team.
Relates to #388.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Refactor `listen_and_fold()` to make it possible to insert more futures
into the chain before the folding.
Signed-off-by: Brian Smith <brian@briansmith.org>