During code review of another change I noticed that a lot of types seem
to derive `Hash` (and `Eq`, `PartialEq`) even though the types should
never (for performance reasons) be used as keys of a hash table, and
where it is kind of questionable what equality should mean for those
types. Then I noticed that similarly many types implement `Clone` even
though I expect we should never be cloning them, again because of our
performance goals.
Because these types derive these traits, then whenever we add a field
to them, that field also has to implement these traits. That means we
then have to expand the problem, deriving implementations of these
traits for types that don't otherwise want/need to implement these
traits. This makes review complicated, because, for example, we have
to decide whether something should be compared case-insensitively or
case-sensitively when really we don't want to compare those things at
all.
To prove that we can get by by doing less, to speed up code review
(particularly related to some stuff related to TLS), stop deriving
`Clone`, `Eq`, `PartialEq`, and `Hash` for these types.
I believe that, in particular, the change to key the Tap hash table
based on request ID, instead of the whole request, should speed up
the tap feature since we don't hash and/or compare every field,
recursively, of requests.
Later more such cleanup of this sort should be done.
Signed-off-by: Brian Smith <brian@briansmith.org>
Depends on #1006. Depends on #1041.
This PR adds a `tls_identity` field to the endpoint `Metadata` struct, which
contains the `TlsIdentity` metadata sent by the control plane's Destination
service.
I changed the `ctx::transport::Client` context struct to hold a `Metadata`,
rather than just the labels, so the TLS support determination is always
available. In addition, I've added it as an additional parameter to
`transport::Connect::new`, so that when we create a new connection, the TLS
code will be able to determine whether or not TLS is supported and, if it is,
how to verify the endpoint's identity.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
The proxy can't actually support 10K clients currently (for one, we can't open
10K resolution streams to the destination service). 100 is a more-realistic
but sufficiently-high default.
This PR modifies the proxy's TLS code so that the TLS config files are reloaded
when any of them has changed (including if they did not previously exist).
If reloading the configs returns an error, we log an error and continue using
the old config.
Currently, this is implemented by polling the file system for the time they
were last modified at a fixed interval. However, I've implemented this so
that the changes are passed around as a `Stream`, and that reloading and
updating the config is in a separate function the one that detects changes.
Therefore, it should be fairly easy to plug in support for `inotify` (and
other FS watch APIs) later, as long as we can use them to generate a
`Stream` of changes.
Closes#369
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Depends on #974. Closes#859.
This PR updates the proxy's `dns` module to use the new `AsyncResolver` API I
added to `trust-dns-resolver` in bluejekyll/trust-dns#487. This allows us to
spawn one `Future` that will drive DNS resolution in the background, rather
than having to repeatedly clone a heavyweight `ResolverFuture` for every
lookup. It also means that we no longer have to clone the name to resolve in
quite as many places.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Closes#711. Depends on #967.
This PR changes the proxy's `destination` module to honor the TTLs associated
with DNS lookups, now that bluejekyll/trust-dns#444 has been merged and we can
access this information from the Trust-DNS Resolver API.
The `destination::background::DestinationSet` type has been modified so that,
when a successful result is received for a DNS query, the DNS server will be
polled again after the deadline associated with that query, rather than after
a fixed deadline. The fixed deadline is still used to determine when to poll
again for negative DNS responses or for errors.
Furthermore, Conduit now accepts an optional CONDUIT_PROXY_DNS_MIN_TTL
environment variable that will configure a minimum TTL for DNS results. If the
deadline of a DNS response gives it a TTL shorter than the configured minimum,
Conduit will not poll DNS again until after that minimum TTL is elapsed. By
default, there is no minimum value set, as this feature is intended primarily
for when Conduit is run locally for development purposes.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Add initial infrastructure for optinally accepting TLS connections.
If the environment gives us the paths to the certificate chain and private key
then use TLS for all accepted TCP connections. Otherwise, continue on using
plaintext for all accepted TCP connections. The default behavior--no TLS--isn't
changed.
Later we'll make this smarter by adding protocol detection so that when the TLS
configuration is available, we'll accept both TLS and non-TLS connections.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Web: remove ns column from tables on individual ns page
* Add prop types and tests for MetricsTable component
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
Required for #1008.
This PR adds the `TlsIdentity` message to the Destination service proto,
to describe what strategy the proxy should use for verifying an endpoint's TLS
certificates. It also adds a `TlsIdentity` field to the `WeightedAddr` message.
Currently, there is one possible variant for `TlsIdentity`, `KubernetesPodName`,
which consists of the Kubernetes pod name of the endpoint, the namespace of
the endpoint, and the namespace of that pod's Conduit control plane. The proxy
should attempt to connect over TLS if the control plane namespace matches its
own control plane namespace. The pod name and namespace are used to verify
the endpoint's TLS certificate.
See https://github.com/runconduit/conduit/issues/386#issuecomment-392948046.
This change was initially part of #1008, but I factored it out to make the diff
smaller.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* Add propType validation
When refactoring components, it is hard to know what is required and isn't.
Adds propTypes to the existing components and enables eslint errors for anything
moving forward. This should keep us documenting the API for components.
* Remove extra newline
While debugging proxy issues, I found it necessary to change how logging contexts are
instrumented, especially for clients.
This change moves away from using `Debug` types to in favor of `Display` types.
Furthermore, the `logging` module now provides a uniform set of logging contexts to be
used throughout the application. All clients, servers, and background tasks should now be
instrumented so that their log messages contain predictable metadata.
Some small improvements have been made to ensure that logging contexts are correct
when a `Future` is dropped (which is important for some H2 uses, especially).
b3170af changed the DstLabels api, but the bench test was not updated
accordingly.
Furthermore, since bench tests require a nightly rust version, we've
avoided running them in CI. This makes it easy for these tests to break, however.
This updates the benches/record.rs. Additionally, in CI, we pin the rust nightly'
version to a known-good version so that we can reliably run these bench test
without the fear of external changes breaking our build.
The proxy receives a hash map of endpoint labels from the destination
service. As this map is serialized into a string, its keys and values
do not have a stable ordering.
To fix this, we sort the keys for all labels before constructing an
instance of `DstLabels`.
This change was much more difficult to test than it was to fix, so tests
this change was tested manually.
Fixes#1015
On the individual namespace pages, the filter should not be shown, as all results that appear on the that page will be for on namespace.
Added a boolean property, showNamespaceFilter, to MetricsTable that allows you to define if the filter should be shown.
Tested that the filter is not shown on namespace pages.
Fixes#972
Signed-off-by: Kim Christensen <kimworking@gmail.com>
As part of the HOC + Context merges, ResourceList missed out on the api injection and errors out on the Namespaces tab.
Wrap the returned HOC in `withContext` to make sure it is there, no matter where it is in the tree. (Fixes#1034)
Depends on tower-rs/tower#75. Required for #386
In order for the proxy to use the TLS support metadata from the Destination
service correctly, we determined that the code for dynamically changing the
labels on an already-bound service should be removed, and any change in
metadata should cause an endpoint to be rebound.
I've modified the proxy so that we no longer update the labels using
`futures-watch` (as a sidenote, we no longer depend on that crate). Metadata
update events now cause the `tower-discover::Discover` implementation for
`DestinationSet` to re-insert the changed endpoint into the load balancer.
Upstream PR tower-rs/tower#75 in tower-balance changes the load balancer
to honor duplicate insertions by replacing the old endpoint rather than
ignoring them; that change is necessary for the tests to pass on this branch.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
* conduit inject: Add flag to set proxy bind timeout (#863)
* fix test
* fix flag to get it working with #909
* Add time parsing
* Use the variable to set the default value
Signed-off-by: Sacha Froment <sfroment42@gmail.com>
* Add a HOC for the REST API tooling
We're copying and duplicating logic all over the place with components that need to talk to the API.
Moves most of the REST API tooling into a HOC that can be used by other components. Now, a component can use `withREST`, pass in the promises that it would like resolved and receive the responses as props.
* Show PageHeader whether there's an error or not
* Hiding page header during loading
* Test updates to work with namespace restructuring
* Rename so_original_dst.rs to addr_info.rs.
Prepare for expanding the functionality of this module by renaming it.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Abstract I/O interface into a trait.
Instead of pattern matching over an `Io` variant, use a `Box<Io>` to
abstract the I/O interface. This will make it easier to add a TLS
transport.
Signed-off-by: Brian Smith <brian@briansmith.org>
Changes to `BoundPort::listen_and_fold` inadvertently broke the
`::logging::context_future`s on the `serve` futures for the Inbound and
outbound proxies, leading to log messages that didn't have the appropriate
context. This fixes that.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
In an effort to highlight the namespace overview pages, remove the Deployments,
Replication Controllers and Pods items from the sidebar and replace them with direct
links to individual Namespace pages. If the user has more than 8 namespaces, only
list the first 8 (the rest can be accessed by the namespace list page).
The Deployments/RCs/Pods endpoints are still available if you go directly to
/deployments, /pods, etc. but they're not highlighted to the user.
The Grafana dashboards currently show Request Volume by ns/deploy/pod.
Add a `meshed` dimension to the Request Volume graphs, in anticipation
of the `meshed`/`secured` label from the proxy. Also increase `irate`
time window queries from `20s` to `30s`, per recommendation from
Prometheus team.
Relates to #388.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Refactor `listen_and_fold()` to make it possible to insert more futures
into the chain before the folding.
Signed-off-by: Brian Smith <brian@briansmith.org>
Previously, we would filter out stats coming from Conduit itself and from the kube-*
namespaces on some views in the Web UI. Remove this filtering, so that we display
all the resource information we get back from the Stat API. (Fixes#997)
On the Resource pages, the call to action would show up when there were no
metrics present, but that's actually not actionable by the user. Instead, I'm
going to show a blank table with a "no s detected" message.
* Remove special-case filtering out of kube-* namespaces, and conduit namespaces
* Remove the call to action for no metrics
* Linkify the namespace column for the resource pages
* Add integration test for conduit controller stats
* Update test for new SECURED column in stat output
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Add an app-wide context for global props.
We've been passing the `api` object down from the top of the react tree. With
16.x, there's now the ability to have context that can inject anywhere in the
tree. This creates a top level context provider that contains most of the global
variables we've been using (api, appData, ...). It subsequently cleans up some
of the routes and nested components.
- Bumps `react-dom` to 16.3.2 (to match `react`).
- Adds `enzyme-context-patch` for now. This is fixed in enzyme master, but there
has not been a release yet. Needs to be removed when that is fixed.
* Use a default inside appData for controllerNamespace
* Update syntax of if to use curly brackets
- Update the `response_total` prometheus query of the StatSummary endpoint to also
break queries out by a `meshed` label.
- Add a 'Secured' column to the web UI/CLI stat displays, which indicate the percentage of traffic
starting and ending in the mesh
This meshed label is used in the CLI/Web UI to display a column of the percentage of traffic that
starts/ends in the mesh. (Which is a proxy indicator for whether that traffic is 'secured' when we
add TLS by default for intra mesh requests).
The `meshed` label is not yet added anywhere, so until it is supplied by the proxy, all traffic will
show up as 0% secured in the web/CLI.
It appears that hyper does not necessarily poll bodies to completion,
and instead simply drops a body as soon as `content-length` is reached
(hyperium/hyper#1521).
This change implements Drop for MeasuredBody such that the stream-end
event is triggered if it had not been triggered previously. This ensures
that response latencies and counts are recorded for HTTP/1 streams.
Fixes#994
In the h1-h2 glue code, we incorrectly called `is_empty()` to determine
if an H1 stream had ended. `is_empty` only returns true if there was no
body at all (rather than if the body has been fully consumed).
By changing this to call `hyper::body::Payload::is_end_stream`, h1
bodies now behave the same as h2 bodies.
Relates to #994
Currently, the proxy records a request's latency as the time between
when a request is opened and when its response stream completes. This is
not what we intend to record, especially when a response is long-lived.
In order to more accurate record latency, we want to track the time at
which the first response body frame is received (which is a close
approximation of time-to-first-byte).
Telemetry aggregation has been changed to use the first-frame time to
compute latencies; tests have been updated to exercise this behavior; and
the metrics documentation has been updated to reflect this change.
Addresses #818
Relates to #980
The proxy does not yet support a `meshed` label.
In anticipation of a `meshed` label in the proxy, introduce this label
in `simulate-proxy`, for testing.
Relates to #306 and #386.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
secured -> meshed
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The Grafana dashboards were explicitly filtering out Conduit
control-plane data.
Remove control-plane filtering from Grafana dashboards. This brings
Grafana in-line with web, and also encourages better dog-fooding of our
proxy metrics and dashboards. Also update Grafana to 5.1.3, update the
BUILD.md architecture diagram to include Promethues and Grafana, and
introduce a Prometheus Benchmark dashboard, courtesy of Robust
Perception.
Fixes#908
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The StatSummary endpoint was dereferencing
StatSummaryRequest.Selector.Resource, causing a panic when it received
an empty request.
Fix StatSummary to use the nil-friendly
StatSummaryRequest.GetSelector().GetResource() methods, and add a test
to validate.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Update web dockerfile to use dev deps when building prod assets
* Don't re-run yarn install as pre-req for build/run/test
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>