When developing features in the proxy, that rely on Linux-only OS features,
developers using other operating systems may find it inconvenient to test
their changes. While we run CI builds on Linux, and may have access to Linux
testing environments, this is not as tightly integrated into the proxy
development workflow as running a quick `cargo test` on the host OS.
For example, I found it inconvenient to test the `inotify` based filesystem
watch code I've been adding in recent commits, and had to do things like
opening a WIP PR for a branch to get CI to run the tests. This workflow is not
ideal.
This PR adds an (admittedly somewhat hacky) script and Dockerfile for running
the proxy's tests in Docker. This accomplishes approximately the same goal as
the `PROXY_SKIP_TESTS` flag that we used to have, but with the advantage that
we no longer include the test dependencies in release builds.
Of course, this also means that we no longer share any of the dependencies
between the test docker build and the release docker build, which is a shame.
It might be worthwhile to re-introduce a dependencies image so that cached
builds of the proxy's dependencies can be shared between the test and release
Dockerfiles. However, I thought that deserved to be discussed separately from
the changes I made in this branch.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Common blacklists have `/api/stat` in them. This causes the dashboard to not load.
`/api/tps-reports` is not in any blacklists, suggests what this route does and is slightly tongue in cheek. Fixes#970
Previously, in conduit stat all we would just print the map of stat results, which
resulted in the order in which stats were displayed varying between prints.
Fix:
Define an array, k8s.StatAllResourceTypes and use the order in this array to print
the map; ensuring a consistent print order every time the command is run.
`tower-balance` has been updated with a Peak-EWMA load balancer; and a
new crate, `tower-h2-balance` has been introduced to make the load
balancer aware of some H2 stream events.
The Peak-EWMA balancer is designed to reduce tail latency by maintaining
an Exponentially Weighted Moving Average of latencies to each endpoint
which decay over a 10s window.
This commit adds the initial wiring to forward TLS config changes to the
watches used by TLS clients as well as TLS servers. As the TLS clients
are not yet implemented, the config type is currently `()`, but once
the client config is implemented, we should be able to drop it in
seamlessly.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Co-authored-by: Brian Smith <brian@briansmith.org>
Depends on #1032.
This branch makes some additional changes to the proxy's DNS code. In
particular, since we no longer need to clone the resolver on every lookup,
it removes some `clone()` calls in `DestinationSet::reset_dns_query`.
I've also changed the DNS futures to use the new contextual logging code
on master.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Fix non-Linux builds.
The change to signal.rs is needed for Windows.
The change to config.rs is needed for Windows and maybe other platforms.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Proxy: Better encapsulate the details of TLS config watching.
Encapsulate more of the TLS configuration logic in the TLS submodule. This allows
for easier refactoring. In particular, this will make adding the client TLS configuration
easier.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Display font-awesome icons no matter what URL is originally loaded
The URLs in the dashboard need to be relative. Unfortunately, this means that if
you load something that isn't the base route ... font-awesome icons look broken.
There's no real way to solve this from within webpack (or the web server without
some work). Instead, just load font-awesome from a CDN as there's no real
benefit we get from including it in the bundle. Fixes#1019.
* Moving font-awesome to styles
This branch adds an inotify-based implementation of filesystem watches
for the TLS config files. On Linux, where inotify is available, this is
used instead of the polling-based code I added in #1056 and #1076.
In order to avoid the issues detecting changes to files in Kubernetes
ConfigMaps described in #1061, we watch the directory _containing_ the
files we care about rather than the files themselves. I've tested this
manually in Docker for Mac Kubernetes and can confirm that ConfigMap
changes are detected successfully.
Closes#1061. Closes#369.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Proxy: Map Kubernetes Pod Namespace/Name to TLS identity.
Map the Kubernetes identity into a DNS name that can be used to
validate the peer's certificate. The final mapping is TBD; the
important thing for now is that the mapped name doesn't collide
with any real DNS name.
Encapsulate the mapping logic within the TLS submodule.
Minimize `Arc`ing and `Clone`ing of TLS identities.
This has no effect in default configurations since the settings that
enable the functionality are not set by default.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Handle an edge case when using bin/web
There's a weird error running `bin/web dev` if you don't have conduit installed on a kubernetes cluster. Nothing in the docs mention that you need to work on this.
Output a user friendly error when we can't find a pod and update the docs to remind folks to install conduit first. Fixes#1070
* Wrap text, send to stderr, fail when missing
This branch changes the polling-based implementation of TLS config file watches
to fully canonicalize the path to each config file prior to polling for its
metadata. Doing so fixes the issues detecting changes when the watched path is
a symbolic link to another symbolic link (see #1061), which is how Kubernetes
implements ConfigMaps mounted as volumes.
I've manually tested this with Conduit running in Docker for Mac Kubernetes,
by volume-mounting a ConfigMap containing the TLS config files, and
regenerating, deleting, and adding the certificates. Watching the Conduit logs
confirms that the changes are now successfully detected.
Note that we have to re-canonicalize the path every time we poll the filesystem
for metadata. Otherwise, if the file is a symlink and the link target changes,
we will continue polling the _old_ link target's path, and fail to detect any
changes to the _new_ link target.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
webpki's DNSName type does full validation and normalization (lowercasing) of
DNS names, which is exactly what `dns::Name` does. webpki's DNSName type
considers a DNS name to be valid according to the rules for TLS certificates,
which is slightly stricter than what a DNS library might otherwise allow. In
anticipation of possible compatibility issues, introduce separate tls::DnsName
and dns::Name names for this type. In the future, if we find that tls::DnsName
is too strict for non-TLS cases, we can have these types diverge without
affecting TLS validation.
Signed-off-by: Brian Smith <brian@briansmith.org>
Encapsulate HTTP request ID generation logic.
Request IDs need to be globally unique, so there can only be one request ID
sequence per process. Simplify the request ID generation with that in mind,
and make it more efficient.
Signed-off-by: Brian Smith <brian@briansmith.org>
Both the conduit stat command and web UI are showing failed and completed pods.
This change filters out those pods before returning the result to the client.
Fixes#1010
Signed-off-by: Ivan Sim <ihcsim@gmail.com>
During code review of another change I noticed that a lot of types seem
to derive `Hash` (and `Eq`, `PartialEq`) even though the types should
never (for performance reasons) be used as keys of a hash table, and
where it is kind of questionable what equality should mean for those
types. Then I noticed that similarly many types implement `Clone` even
though I expect we should never be cloning them, again because of our
performance goals.
Because these types derive these traits, then whenever we add a field
to them, that field also has to implement these traits. That means we
then have to expand the problem, deriving implementations of these
traits for types that don't otherwise want/need to implement these
traits. This makes review complicated, because, for example, we have
to decide whether something should be compared case-insensitively or
case-sensitively when really we don't want to compare those things at
all.
To prove that we can get by by doing less, to speed up code review
(particularly related to some stuff related to TLS), stop deriving
`Clone`, `Eq`, `PartialEq`, and `Hash` for these types.
I believe that, in particular, the change to key the Tap hash table
based on request ID, instead of the whole request, should speed up
the tap feature since we don't hash and/or compare every field,
recursively, of requests.
Later more such cleanup of this sort should be done.
Signed-off-by: Brian Smith <brian@briansmith.org>
Depends on #1006. Depends on #1041.
This PR adds a `tls_identity` field to the endpoint `Metadata` struct, which
contains the `TlsIdentity` metadata sent by the control plane's Destination
service.
I changed the `ctx::transport::Client` context struct to hold a `Metadata`,
rather than just the labels, so the TLS support determination is always
available. In addition, I've added it as an additional parameter to
`transport::Connect::new`, so that when we create a new connection, the TLS
code will be able to determine whether or not TLS is supported and, if it is,
how to verify the endpoint's identity.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
The proxy can't actually support 10K clients currently (for one, we can't open
10K resolution streams to the destination service). 100 is a more-realistic
but sufficiently-high default.
This PR modifies the proxy's TLS code so that the TLS config files are reloaded
when any of them has changed (including if they did not previously exist).
If reloading the configs returns an error, we log an error and continue using
the old config.
Currently, this is implemented by polling the file system for the time they
were last modified at a fixed interval. However, I've implemented this so
that the changes are passed around as a `Stream`, and that reloading and
updating the config is in a separate function the one that detects changes.
Therefore, it should be fairly easy to plug in support for `inotify` (and
other FS watch APIs) later, as long as we can use them to generate a
`Stream` of changes.
Closes#369
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Depends on #974. Closes#859.
This PR updates the proxy's `dns` module to use the new `AsyncResolver` API I
added to `trust-dns-resolver` in bluejekyll/trust-dns#487. This allows us to
spawn one `Future` that will drive DNS resolution in the background, rather
than having to repeatedly clone a heavyweight `ResolverFuture` for every
lookup. It also means that we no longer have to clone the name to resolve in
quite as many places.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Closes#711. Depends on #967.
This PR changes the proxy's `destination` module to honor the TTLs associated
with DNS lookups, now that bluejekyll/trust-dns#444 has been merged and we can
access this information from the Trust-DNS Resolver API.
The `destination::background::DestinationSet` type has been modified so that,
when a successful result is received for a DNS query, the DNS server will be
polled again after the deadline associated with that query, rather than after
a fixed deadline. The fixed deadline is still used to determine when to poll
again for negative DNS responses or for errors.
Furthermore, Conduit now accepts an optional CONDUIT_PROXY_DNS_MIN_TTL
environment variable that will configure a minimum TTL for DNS results. If the
deadline of a DNS response gives it a TTL shorter than the configured minimum,
Conduit will not poll DNS again until after that minimum TTL is elapsed. By
default, there is no minimum value set, as this feature is intended primarily
for when Conduit is run locally for development purposes.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Add initial infrastructure for optinally accepting TLS connections.
If the environment gives us the paths to the certificate chain and private key
then use TLS for all accepted TCP connections. Otherwise, continue on using
plaintext for all accepted TCP connections. The default behavior--no TLS--isn't
changed.
Later we'll make this smarter by adding protocol detection so that when the TLS
configuration is available, we'll accept both TLS and non-TLS connections.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Web: remove ns column from tables on individual ns page
* Add prop types and tests for MetricsTable component
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Required for #1008.
This PR adds the `TlsIdentity` message to the Destination service proto,
to describe what strategy the proxy should use for verifying an endpoint's TLS
certificates. It also adds a `TlsIdentity` field to the `WeightedAddr` message.
Currently, there is one possible variant for `TlsIdentity`, `KubernetesPodName`,
which consists of the Kubernetes pod name of the endpoint, the namespace of
the endpoint, and the namespace of that pod's Conduit control plane. The proxy
should attempt to connect over TLS if the control plane namespace matches its
own control plane namespace. The pod name and namespace are used to verify
the endpoint's TLS certificate.
See https://github.com/runconduit/conduit/issues/386#issuecomment-392948046.
This change was initially part of #1008, but I factored it out to make the diff
smaller.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Add propType validation
When refactoring components, it is hard to know what is required and isn't.
Adds propTypes to the existing components and enables eslint errors for anything
moving forward. This should keep us documenting the API for components.
* Remove extra newline
While debugging proxy issues, I found it necessary to change how logging contexts are
instrumented, especially for clients.
This change moves away from using `Debug` types to in favor of `Display` types.
Furthermore, the `logging` module now provides a uniform set of logging contexts to be
used throughout the application. All clients, servers, and background tasks should now be
instrumented so that their log messages contain predictable metadata.
Some small improvements have been made to ensure that logging contexts are correct
when a `Future` is dropped (which is important for some H2 uses, especially).
b3170af changed the DstLabels api, but the bench test was not updated
accordingly.
Furthermore, since bench tests require a nightly rust version, we've
avoided running them in CI. This makes it easy for these tests to break, however.
This updates the benches/record.rs. Additionally, in CI, we pin the rust nightly'
version to a known-good version so that we can reliably run these bench test
without the fear of external changes breaking our build.
The proxy receives a hash map of endpoint labels from the destination
service. As this map is serialized into a string, its keys and values
do not have a stable ordering.
To fix this, we sort the keys for all labels before constructing an
instance of `DstLabels`.
This change was much more difficult to test than it was to fix, so tests
this change was tested manually.
Fixes#1015
On the individual namespace pages, the filter should not be shown, as all results that appear on the that page will be for on namespace.
Added a boolean property, showNamespaceFilter, to MetricsTable that allows you to define if the filter should be shown.
Tested that the filter is not shown on namespace pages.
Fixes#972
Signed-off-by: Kim Christensen <kimworking@gmail.com>
As part of the HOC + Context merges, ResourceList missed out on the api injection and errors out on the Namespaces tab.
Wrap the returned HOC in `withContext` to make sure it is there, no matter where it is in the tree. (Fixes#1034)