During code review of another change I noticed that a lot of types seem
to derive `Hash` (and `Eq`, `PartialEq`) even though the types should
never (for performance reasons) be used as keys of a hash table, and
where it is kind of questionable what equality should mean for those
types. Then I noticed that similarly many types implement `Clone` even
though I expect we should never be cloning them, again because of our
performance goals.
Because these types derive these traits, then whenever we add a field
to them, that field also has to implement these traits. That means we
then have to expand the problem, deriving implementations of these
traits for types that don't otherwise want/need to implement these
traits. This makes review complicated, because, for example, we have
to decide whether something should be compared case-insensitively or
case-sensitively when really we don't want to compare those things at
all.
To prove that we can get by by doing less, to speed up code review
(particularly related to some stuff related to TLS), stop deriving
`Clone`, `Eq`, `PartialEq`, and `Hash` for these types.
I believe that, in particular, the change to key the Tap hash table
based on request ID, instead of the whole request, should speed up
the tap feature since we don't hash and/or compare every field,
recursively, of requests.
Later more such cleanup of this sort should be done.
Signed-off-by: Brian Smith <brian@briansmith.org>
Depends on #1006. Depends on #1041.
This PR adds a `tls_identity` field to the endpoint `Metadata` struct, which
contains the `TlsIdentity` metadata sent by the control plane's Destination
service.
I changed the `ctx::transport::Client` context struct to hold a `Metadata`,
rather than just the labels, so the TLS support determination is always
available. In addition, I've added it as an additional parameter to
`transport::Connect::new`, so that when we create a new connection, the TLS
code will be able to determine whether or not TLS is supported and, if it is,
how to verify the endpoint's identity.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
b3170af changed the DstLabels api, but the bench test was not updated
accordingly.
Furthermore, since bench tests require a nightly rust version, we've
avoided running them in CI. This makes it easy for these tests to break, however.
This updates the benches/record.rs. Additionally, in CI, we pin the rust nightly'
version to a known-good version so that we can reliably run these bench test
without the fear of external changes breaking our build.
Currently, the proxy records a request's latency as the time between
when a request is opened and when its response stream completes. This is
not what we intend to record, especially when a response is long-lived.
In order to more accurate record latency, we want to track the time at
which the first response body frame is received (which is a close
approximation of time-to-first-byte).
Telemetry aggregation has been changed to use the first-frame time to
compute latencies; tests have been updated to exercise this behavior; and
the metrics documentation has been updated to reflect this change.
Addresses #818
Relates to #980
Proxy tasks emit events to the telemetry system. These events are used
aggregate counts and latencies, as well as to inform Tap requests.
Initially, these events included durations, describing the relevant time
that elapsed between this event and another.
This approach is somewhat inflexible -- it unnecessarily constrains the
set of measurements that can computed in the telemetry system.
To remedy this, the `Event` types can be changed to report discrete
`Instant`s (rather than `Duration`s). Then, when latencies are computed
in the telemetry system, these discrete instants can be compared to
produce durations.
There are no functional changes in this PR.
Before changing the telemetry implementation, we should have a means to
understand the impacts of such changes.
To run, you must use a nightly toolchain:
```
rustup run nightly cargo bench -p conduit-proxy -- record
```