linkerd2-proxy

Commit Graph

Author	SHA1	Message	Date
Eliza Weisman	1678d6b33b	proxy: Change `DEFAULT_OUTBOUND_ROUTER_CAPACITY` from 10,000 to 100 (#1060 ) The proxy can't actually support 10K clients currently (for one, we can't open 10K resolution streams to the destination service). 100 is a more-realistic but sufficiently-high default.	2018-06-04 14:34:34 -07:00
Eliza Weisman	896fe75929	proxy: Reload TLS config on changes (#1056 ) This PR modifies the proxy's TLS code so that the TLS config files are reloaded when any of them has changed (including if they did not previously exist). If reloading the configs returns an error, we log an error and continue using the old config. Currently, this is implemented by polling the file system for the time they were last modified at a fixed interval. However, I've implemented this so that the changes are passed around as a `Stream`, and that reloading and updating the config is in a separate function from the one that detects changes. Therefore, it should be fairly easy to plug in support for `inotify` (and other FS watch APIs) later, as long as we can use them to generate a `Stream` of changes. Closes #369 Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-06-04 13:36:28 -07:00
Eliza Weisman	f73e34e0d8	proxy: Update `dns` module to use new Trust-DNS `AsyncResolver` API (#1032 ) Depends on #974. Closes #859. This PR updates the proxy's `dns` module to use the new `AsyncResolver` API I added to `trust-dns-resolver` in bluejekyll/trust-dns#487. This allows us to spawn one `Future` that will drive DNS resolution in the background, rather than having to repeatedly clone a heavyweight `ResolverFuture` for every lookup. It also means that we no longer have to clone the name to resolve in quite as many places. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-06-01 14:36:37 -07:00
Eliza Weisman	0e29e2f5cf	proxy: Honor TTLs for DNS responses (#974 ) Closes #711. Depends on #967. This PR changes the proxy's `destination` module to honor the TTLs associated with DNS lookups, now that bluejekyll/trust-dns#444 has been merged and we can access this information from the Trust-DNS Resolver API. The `destination::background::DestinationSet` type has been modified so that, when a successful result is received for a DNS query, the DNS server will be polled again after the deadline associated with that query, rather than after a fixed deadline. The fixed deadline is still used to determine when to poll again for negative DNS responses or for errors. Furthermore, Conduit now accepts an optional CONDUIT_PROXY_DNS_MIN_TTL environment variable that will configure a minimum TTL for DNS results. If the deadline of a DNS response gives it a TTL shorter than the configured minimum, Conduit will not poll DNS again until after that minimum TTL is elapsed. By default, there is no minimum value set, as this feature is intended primarily for when Conduit is run locally for development purposes. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-06-01 12:17:48 -07:00
Sean McArthur	fbbe28d032	proxy: update h2 to cancel reset requests (#1051 ) This includes the changes that should detect when a client sends a `RST_STREAM`, and cancels our pending response future or streaming body. Closes #986	2018-06-01 02:53:21 +02:00
Brian Smith	044754ee24	Add initial infrastructure for optionally accepting TLS connections (#1047 ) * Add initial infrastructure for optionally accepting TLS connections. If the environment gives us the paths to the certificate chain and private key then use TLS for all accepted TCP connections. Otherwise, continue on using plaintext for all accepted TCP connections. The default behavior--no TLS--isn't changed. Later we'll make this smarter by adding protocol detection so that when the TLS configuration is available, we'll accept both TLS and non-TLS connections. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-05-31 12:20:57 -10:00
Eliza Weisman	be9486c239	proto: Add TLS identity to WeightedAddr message (#1041 ) Required for #1008. This PR adds the `TlsIdentity` message to the Destination service proto, to describe what strategy the proxy should use for verifying an endpoint's TLS certificates. It also adds a `TlsIdentity` field to the `WeightedAddr` message. Currently, there is one possible variant for `TlsIdentity`, `KubernetesPodName`, which consists of the Kubernetes pod name of the endpoint, the namespace of the endpoint, and the namespace of that pod's Conduit control plane. The proxy should attempt to connect over TLS if the control plane namespace matches its own control plane namespace. The pod name and namespace are used to verify the endpoint's TLS certificate. See https://github.com/runconduit/conduit/issues/386#issuecomment-392948046. This change was initially part of #1008, but I factored it out to make the diff smaller. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-31 11:48:25 -07:00
Oliver Gould	30ae471dda	proxy: Add rich logging contexts (#1037 ) While debugging proxy issues, I found it necessary to change how logging contexts are instrumented, especially for clients. This change moves away from using `Debug` types in favor of `Display` types. Furthermore, the `logging` module now provides a uniform set of logging contexts to be used throughout the application. All clients, servers, and background tasks should now be instrumented so that their log messages contain predictable metadata. Some small improvements have been made to ensure that logging contexts are correct when a `Future` is dropped (which is important for some H2 uses, especially).	2018-05-30 13:41:59 -07:00
Oliver Gould	91075e7d32	proxy: Fix bench tests and require bench tests in CI (#1038 ) b3170af changed the DstLabels api, but the bench test was not updated accordingly. Furthermore, since bench tests require a nightly rust version, we've avoided running them in CI. This makes it easy for these tests to break, however. This updates the benches/record.rs. Additionally, in CI, we pin the rust nightly version to a known-good version so that we can reliably run these bench tests without the fear of external changes breaking our build.	2018-05-30 07:20:28 -07:00
Oliver Gould	1c8916550e	proxy: Ensure labels are reliably ordered (#1030 ) The proxy receives a hash map of endpoint labels from the destination service. As this map is serialized into a string, its keys and values do not have a stable ordering. To fix this, we sort the keys for all labels before constructing an instance of `DstLabels`. This change was much more difficult to test than it was to fix, so this change was tested manually. Fixes #1015	2018-05-30 07:13:26 -07:00
Eliza Weisman	ec72012982	proxy: Remove dynamic label updating on bound services (#1006 ) Depends on tower-rs/tower#75. Required for #386 In order for the proxy to use the TLS support metadata from the Destination service correctly, we determined that the code for dynamically changing the labels on an already-bound service should be removed, and any change in metadata should cause an endpoint to be rebound. I've modified the proxy so that we no longer update the labels using `futures-watch` (as a sidenote, we no longer depend on that crate). Metadata update events now cause the `tower-discover::Discover` implementation for `DestinationSet` to re-insert the changed endpoint into the load balancer. Upstream PR tower-rs/tower#75 in tower-balance changes the load balancer to honor duplicate insertions by replacing the old endpoint rather than ignoring them; that change is necessary for the tests to pass on this branch. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-29 12:48:59 -07:00
Brian Smith	4ede9a7ef3	Fix location of raw pointer comment in `ContextGuard`. (#1027 ) Commit b861a6df317c937123825098a7ef0b50cf52e281 moved the code the comment was describing, but didn't move the comment. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-05-26 18:30:37 -10:00
Brian Smith	79a38327d2	Abstract I/O interface into a trait. (#1020 ) * Rename so_original_dst.rs to addr_info.rs. Prepare for expanding the functionality of this module by renaming it. Signed-off-by: Brian Smith <brian@briansmith.org> * Abstract I/O interface into a trait. Instead of pattern matching over an `Io` variant, use a `Box<Io>` to abstract the I/O interface. This will make it easier to add a TLS transport. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-05-26 10:04:31 -10:00
Eliza Weisman	4cca72fb92	proxy: Fix missing logging contexts on inbound/outbound (#1025 ) Changes to `BoundPort::listen_and_fold` inadvertently broke the `::logging::context_future`s on the `serve` futures for the inbound and outbound proxies, leading to log messages that didn't have the appropriate context. This fixes that. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-25 16:31:33 -07:00
Brian Smith	7764e97a25	Prepare `BoundPort::listen_and_fold` for upcoming TLS work. (#1018 ) Refactor `listen_and_fold()` to make it possible to insert more futures into the chain before the folding. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-05-25 09:59:05 -10:00
Oliver Gould	0e6c4c1450	proxy: Record EOS when bodies are dropped (#1003 ) It appears that hyper does not necessarily poll bodies to completion, and instead simply drops a body as soon as `content-length` is reached (hyperium/hyper#1521). This change implements Drop for MeasuredBody such that the stream-end event is triggered if it had not been triggered previously. This ensures that response latencies and counts are recorded for HTTP/1 streams. Fixes #994	2018-05-24 10:40:29 -07:00
Oliver Gould	4fdbd1631a	proxy: Fix h1 body implementation (#995 ) In the h1-h2 glue code, we incorrectly called `is_empty()` to determine if an H1 stream had ended. `is_empty` only returns true if there was no body at all (rather than if the body has been fully consumed). By changing this to call `hyper::body::Payload::is_end_stream`, h1 bodies now behave the same as h2 bodies. Relates to #994	2018-05-24 07:23:14 -07:00
Oliver Gould	1d5ef1e4d5	proxy: Record HTTP latency at first data frame (#981 ) Currently, the proxy records a request's latency as the time between when a request is opened and when its response stream completes. This is not what we intend to record, especially when a response is long-lived. In order to more accurately record latency, we want to track the time at which the first response body frame is received (which is a close approximation of time-to-first-byte). Telemetry aggregation has been changed to use the first-frame time to compute latencies; tests have been updated to exercise this behavior; and the metrics documentation has been updated to reflect this change. Addresses #818 Relates to #980	2018-05-23 16:02:44 -07:00
Carl Lerche	f41e74fd2c	Proxy: Bump h2 version to v0.1.8 (#990 ) Signed-off-by: Carl Lerche <me@carllerche.com>	2018-05-23 12:45:14 -07:00
Oliver Gould	afa7fef976	proxy: Alter telemetry to use discrete instants (#980 ) Proxy tasks emit events to the telemetry system. These events are used to aggregate counts and latencies, as well as to inform Tap requests. Initially, these events included durations, describing the relevant time that elapsed between this event and another. This approach is somewhat inflexible -- it unnecessarily constrains the set of measurements that can be computed in the telemetry system. To remedy this, the `Event` types can be changed to report discrete `Instant`s (rather than `Duration`s). Then, when latencies are computed in the telemetry system, these discrete instants can be compared to produce durations. There are no functional changes in this PR.	2018-05-22 14:57:00 -07:00
Eliza Weisman	d709ec37e3	proxy: Remove configure-and-bind-to-executor pattern (#967 ) A common pattern when using the old Tokio API was separating the configuration of a task from binding it to an executor to run on. This was often necessary when we wanted to construct a type corresponding to some task before the reactor on which it would execute was initialized. Typically, this was accomplished with two separate types, one of which represented the configuration and exposed only a method to take a reactor `Handle` and transform it to the other type, representing the actual task. After we migrate to the new Tokio API in #944, executors no longer need to be passed explicitly, as we can use `DefaultExecutor::current` or `current_thread::TaskExecutor::current` to spawn a task on the current executor. Therefore, a lot of this complexity can be refactored away. This PR refactors the `Config` and `Process` structs in `control::destination::background` into a single `Background` struct, and removes the `dns::Config` and `telemetry::MakeControl` structs (`dns::Resolver` and `telemetry::Control` are now constructed directly). It should not cause any functional changes. Closes #966 Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-21 15:40:33 -07:00
Eliza Weisman	7bf4c1bc41	proxy: Use `impl Trait` to unbox some futures (#969 ) Now that `impl Trait` is stable, we don't need to box as many futures. We still need to box before spawning them on an executor, but the component futures no longer require their own boxes. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-19 13:19:05 -07:00
Eliza Weisman	1b1623dd83	proxy: Upgrade Conduit to use the new version of Tokio (#944 ) Closes #888. Closes #867. This branch upgrades Conduit to use the new Tokio API. It was also necessary to upgrade some other dependencies (including `hyper`, and `trust-dns`) alongside this upgrade. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-17 16:38:15 -07:00
Sean McArthur	d9fd091411	proxy: rebind services on connect errors (#952 ) Instead of having connect errors destroy all buffered requests, this changes Bind to return a service that can rebind itself when there is a connect error. It won't try to establish the new connection itself, but waits for the buffer to poll again. Combining this with changes in tower-buffer to remove canceled requests from the buffer should mean that we won't loop on connect errors forever. Signed-off-by: Sean McArthur <sean@seanmonstar.com>	2018-05-17 14:15:16 -07:00
Oliver Gould	cd923abf94	proxy: Drop destination resolutions when unused (#956 ) A proxy dispatches requests over a constrained number of routes. When the router's upper bound is reached---and potentially in other future scenarios---router capacity is created by removing unused routes, their load balancers, and all related endpoint stacks. However, in the current regime, the controller subsystem will continue to monitor discovery observations. As the number of active observations expands over time, the controller task ends up with more and more work to do. This change introduces a shared atomic boolean between the resolution returned to the load balancer and the state maintained when communicating with the service. Before the controller polls its active resolutions, it first ensures that all unused resolutions are dropped.	2018-05-16 17:28:11 -07:00
Eliza Weisman	4473fd114d	proxy: Fix end events not firing when a stream is ended by a DATA frame (#957 ) A recent upstream change in `tower-h2` (tower-rs/tower-h2@d9b3140) caused some HTTP/2 streams that were previously terminated by TRAILERS frames to be terminated by empty DATA frames with the end of stream bit set, instead. This broke some tests in my dev branch for #944, as our test server also uses `tower-h2`, and some of the metrics tests were no longer seeing the expected `StreamResponseEnd` events due to this change. This issue may also occur in other cases, resulting in incorrect metrics. This PR changes `MeasuredBody::poll_data` to trigger the Stream End event if it sees a DATA frame that ends the stream. Fixes #954 Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-16 16:24:29 -07:00
Oliver Gould	b23ed6e651	rfc: proxy: Split `control::discovery` into submodules (#955 ) While preparing #946, I was again struck by the `discovery` module being very weighty (nearly 800 dense lines). The intent of this change is only to improve readability. There are no functional changes. The following aesthetic changes have been made: * `control::discovery` has been renamed to `control::destination` to be more consistent with the rest of conduit's terminology (destinations aren't the only thing that need to be discovered). * In that vein, the `Discovery` type has been renamed `Resolver` (since it exposes one function, `resolve`). * The `Watch` type has been renamed `Resolution`. This disambiguates the type from `futures_watch::Watch` (which is used in the same code) and makes it more clearly the product of a `Resolver`. * The `Background` and `DiscoveryWork` names were very opaque. `Background` is now `background::Config` to indicate that it can't actually _do_ anything; and `DiscoveryWork` is now `background::Process` to indicate that it's responsible for processing destination updates. * `DestinationSet` is now a private implementation detail in the `background` module. * An internal `ResolveRequest` type replaces an unnamed tuple (now that it's used across files). * `rustfmt` has been run on `background.rs` and `endpoint.rs`	2018-05-15 17:23:01 -07:00
Eliza Weisman	86a75907ca	proxy: Move absolute URI detection to `bind::Protocol` (#938 ) This is in preparation for landing the Tokio upgrade. In the upcoming Hyper release, the handling of absolute form request URIs moved from `hyper::Request` to the `hyper::client::connect::Connect` trait. Once we upgrade to the new Tokio, we will have to upgrade our Hyper dependency as well. Currently, Conduit detects whether the request URI is in absolute form in `h1::normalize_our_view_of_uri` and adds an extension to the request if it is. This will no longer work with the new Hyper, as that function is called from the `bind::NormalizeUri` service, which is not constructed until after the client connection is established. Therefore, it is necessary to move this information to `bind::Protocol`, so that it can be passed to `transparency::client::HyperConnect` (our implementation of Hyper's `Connect` trait) when we are using the newest Hyper. For now, however, I've left in the `UriIsAbsoluteForm` extension and continued to set it in `h1::normalize_our_view_of_uri`, since we currently still use it on the current Hyper version. I thought it was good to minimize the changes to this existing code, as it will be removed when we migrate to the new Hyper. This PR shouldn't cause any functional changes. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-15 13:16:58 -07:00
Carl Lerche	be1b06fa22	Proxy: Update h2 dependency. (#948 ) This also updates `bytes` as the newest version is required by `h2`. Signed-off-by: Carl Lerche <me@carllerche.com>	2018-05-14 11:23:25 -07:00
Eliza Weisman	57c8504899	proxy: Add a lazy version of ThreadRng (#936 ) This is in preparation for landing the Tokio upgrade. In order to be generic over Tokio's current thread and threadpool executors, a number of types in Conduit which were not previously `Send` are now required to be `Send`. A majority of this work will be done in the main Tokio upgrade PR, as it is in many cases not possible to make these types `Send` _without_ using the new Tokio API (in order to remove `Handle`s, etc.); however, I'm factoring out everything possible and trying to land it in separate PRs. The p2c load balancer constructed in `Outbound` is currently parameterized over a random number generator. We currently construct it by getting the thread-local RNG, and passing it to the load balancer constructor. However, the thread-local RNG is not `Send`. I've fixed this issue by creating a new zero-sized empty struct type which implements `rand::Rng` simply by calling `thread_rng()` every time it's called, and passing that to `choose::power_of_two_choices` instead. Since this is an empty type which contains no data, and the correct thread-local RNG is accessed whenever the methods are called, this new type can trivially be `Send`. According to the `rand` crate's documentation, this is the correct way to use `ThreadRng` anyway: > Retrieve the lazily-initialized thread-local random number generator, seeded > by the system. Intended to be used in method chaining style, e.g. > `thread_rng().gen::<i32>()`. > (from https://docs.rs/rand/0.4.2/rand/fn.thread_rng.html) This shouldn't lead to any functional changes. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-11 17:26:23 -07:00
Eliza Weisman	281281f5bc	proxy: Make `outbound_updates_newer_services` test forward-compatible (#939 ) This is in preparation for landing the Tokio upgrade. The test `discovery::outbound_updates_newer_services` currently contains an assertion that an HTTP/2 request to an HTTP/1 service will return a response with status code 500. This is because the current version of Hyper on which Conduit depends does not support protocol upgrades. However, commit hyperium/hyper@bc6af88a32, which adds support for this kind of protocol upgrade, was recently merged to Hyper's master branch. Therefore, this assertion will no longer be correct once we depend on the upcoming Hyper release. When we migrate to the new Tokio, it will be necessary to upgrade our Hyper dependency as well, and this test will fail. I've modified the test to no longer make assertions about the response's status code, so that it's compatible with both the current and future Hyper versions. If the response is not `Ok`, the test will still fail, since `tests::support::Client::request()` `expect`s that the response is successful, but the status code is ignored. I've added a comment in the test explaining this. Eventually, when the master version of Conduit depends on the latest Hyper, we may want to change this test to assert that the status code is 200 instead. We may also want to add more tests for Hyper's protocol upgrade functionality, but that seems out of scope for this PR. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-05-11 14:36:03 -07:00
Oliver Gould	030bd404fa	proxy/router: Implement LRU cache eviction (#925 ) The router's cache has no means to evict unused entries when capacity is reached. This change does the following: - Wraps cache values in a smart pointer that tracks the last time of access for each entry. The smart pointer updates the access time when the reference to entry is dropped. - When capacity is not available, all nodes that have not been accessed within some minimal idle age are dropped. Accesses and updates to the map are O(1) when capacity is available. Reclaiming capacity is O(n), so it's expected that the router is configured with enough capacity such that capacity need not be reclaimed usually.	2018-05-10 19:06:31 -07:00
Oliver Gould	1842418c97	proxy/router: Create a separate `cache` module (#920 ) The router's `Inner` type contains a map of routes. Recently, this map's capacity has become constrained to prevent leakage for long-running processes. This change prepares for a fuller LRU implementation by moving the router's `Inner` type to a new (tested) module, `cache`.	2018-05-10 14:00:50 -07:00
Oliver Gould	b238d97137	router: Store `recognize` outside of lock (#913 ) The router stores its cache and `Recognize` implementation within a `Mutex`, but there is no need for the recognizer to be locked. This change creates a new `Cache` type that is locked independently of `Recognize`. In order to accomplish this, `Recognize::bind_service` has been changed to take an immutable reference to its `self`. The (unused) `Single` type has been removed because it relied on `bind_service` being mutable.	2018-05-10 11:51:01 -07:00
Oliver Gould	ee839d5ba6	proxy: Remove try_bind_route! macro (#915 ) The macro is now only used once, so it seems clearer just to inline the logic.	2018-05-09 14:38:20 -07:00
Sean McArthur	011d2541eb	proxy: change peek to use reads for eventual support of TLS (#901 )	2018-05-08 18:19:12 -07:00
Oliver Gould	50cb2f84db	rfc: Make DestinationServiceQuery generic (#749 ) The goals of this change are: 1. Reduce the size/complexity of `control::discovery` in order to ease code reviews. 2. Extract a reusable grpc streaming utility. There are no intended functional changes. `control::discovery::DestinationServiceQuery` is used to track the state of a request (and streaming response) to the destination service. Very little of this logic is specific to the destination service. The `DestinationServiceQuery` and associated `UpdateRx` type have been moved to a new module, `control::remote_stream`, as `Remote` and `Receiver`, respectively. Both of these types are generic over the gRPC message type, so it will be possible to use this utility with additional API endpoints. The `Receiver::poll` implementation has been simplified to be more idiomatic with the rest of our code (namely, using `try_ready!`).	2018-05-08 16:54:20 -07:00
Oliver Gould	2392b3df2d	proxy: Parse units with duration configurations (#909 ) Configuration values that take durations are currently specified as time values with no units. So `600` may mean 600ms in some contexts and 10 minutes in others. In order to avoid this problem, this change now requires that configurations provide explicit units for time values such as '600ms' or '10 minutes'. Fixes #27.	2018-05-08 13:54:12 -07:00
Oliver Gould	f97cc718dd	proxy: Use Duration types for config defaults (#906 ) It's easy to misconfigure default durations, since they're recorded as integers and converted to Durations separately. Now, all default constants that represent durations use const `Duration` instances (enabled by a recent Rust release). This fixes #905 which was caused by using the wrong time unit for the metrics retain time.	2018-05-08 10:58:22 -07:00
Oliver Gould	3d6586a19f	proxy: Track SingleUse services against router capacity (#902 ) PR #898 introduces capacity limits to the balancer. However, because the router supports "single-use" routes--routes that are bound only for the life of a single HTTP1 request--it is easy for a router to exceed its configured capacity. In order to fix this, the `Reuse` type is removed from the router library so that _all_ routes are considered cacheable. It's now the responsibility of the bound service to enforce policies with regards to client retention. Routes were not added to the cache when the service could not be used to process more than a single request. Now, `Bind` wraps its returned services (via the `Binding` type), that dictate whether a single client is reused or if one is bound for each request. This enables all routes to be cached without changing behavior with regards to connection reuse.	2018-05-08 10:57:56 -07:00
Oliver Gould	a80da120ad	proxy: Bound on router capacity (#898 ) Currently, the proxy may cache an unbounded number of routes. In order to prevent such leaks in production, new configurations are introduced to limit the number of inbound and outbound HTTP routes. By default, we support 100 inbound routes and 10K outbound routes. In a followup, we'll introduce an eviction strategy so that capacity can be reclaimed gracefully.	2018-05-04 16:32:30 -07:00
Oliver Gould	a8d55b5293	proxy: Refactor router implementation (#894 ) The Router's primary `call` implementation is somewhat difficult to follow. This change does not introduce any functional changes, but makes the function easier to reason about. This is being done in preparation for functional changes.	2018-05-02 15:47:36 -07:00
Oliver Gould	bdc19d926c	proxy: Upgrade tower dependencies (#892 ) In order to pick up https://github.com/tower-rs/tower-grpc/pull/60, upgrade tower dependencies. This will reduce the cost of updating for upcoming tower-h2 improvements.	2018-05-02 13:40:55 -07:00
Eliza Weisman	a85434551f	Add unit tests for `metrics::record` (#890 ) This PR adds unit tests for `metrics::record`, based on the benchmarks for the same function. Currently, there is a test that fires a single response end event and asserts that the metrics state is correct afterward, and a test that fires all the events to simulate a full connection lifetime, and asserts that the metrics state is correct afterward. I'd like to also add a test that simulates multiple events with different labels, but I'll add that in a subsequent PR. In order to add these tests, it was necessary to add test-only accessors to make some `metrics` structs `pub` so that the test can access them. I also added some test-only functions to `metrics::Histogram`s, to make them easier to make assertions about.	2018-05-02 13:26:27 -07:00
Oliver Gould	77017eedea	proxy: Fix Tap ID generation (#885 ) The proxy's tap server assigns a sequential numeric ID to each inbound Tap request to assist tap lifecycle management. The server implementation keeps a local counter to keep track of tap IDs. However, this implementation is cloned for each individual tap request, so `0` is the only tap ID ever used. This change moves the Tap ID to be stored in a shared atomic integer. Debug logging has been improved as well.	2018-05-01 11:59:45 -07:00
Eliza Weisman	18e6eafb85	proxy: Fix metrics constructor in benches (#881 ) Fixes a test compilation error.	2018-04-30 17:48:07 -07:00
Oliver Gould	810f6bb719	proxy: Expire metrics that have not been updated for 10 minutes (#880 ) The proxy is now configured with the CONDUIT_PROXY_METRICS_RETAIN_IDLE environment variable that dictates the amount of time that the proxy will retain metrics that have not been updated. A timestamp is maintained for each unique set of labels, indicating the last time that the scope was updated. Then, when metrics are read, all metrics older than CONDUIT_PROXY_METRICS_RETAIN_IDLE are dropped from the stats registry. A ctx::test_utils module has been added to aid testing. Fixes #819	2018-04-30 16:11:12 -07:00
Oliver Gould	01aba7c711	proxy: Group metrics by label (#879 ) Previously, we maintained a map of labels for each metric. Because the same keys are used in multiple scopes, this causes redundant hashing & map lookup when updating metrics. With this change, there is now only one map per unique label scope and all of the metrics for each scope are stored in the value. This makes metrics inserting faster and prepares for eviction of idle metrics. The Metric type has been split into Metric, which now only holds metric metadata and is responsible for printing a given metric, and Scopes which holds groupings of metrics by label. The metrics! macro is provided to make it easy to define Metric instances statically.	2018-04-30 15:33:09 -07:00
Oliver Gould	c63f0a1976	proxy: Make each metric type responsible for formatting (#878 ) In order to set up for a refactor that removes the `Metric` type, the `FmtMetric` trait--implemented by `Counter`, `Gauge`, and `Histogram`--is introduced to push prometheus formatting down into each type. With this change, the `Histogram` type now relies on `Counter` (and its metric formatting) more heavily.	2018-04-30 13:00:21 -07:00
Oliver Gould	29330b0dc1	Move `metrics::Serve` into its own module (#877 ) With this change, metrics/mod.rs now contains only metrics types.	2018-04-30 10:52:08 -07:00
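
As an illustration of commit 2392b3df2d (#909) in the table above, which requires explicit units on duration settings, here is a minimal sketch of such a parser. The function name `parse_duration` and the accepted suffixes (`ms`, `s`, `m`, `h`, `d`) are assumptions made for this example and are not taken from the proxy's source.

```rust
use std::time::Duration;

/// Parses a duration string that must carry a unit suffix, e.g. "600ms" or "10m".
/// Hypothetical helper: the set of suffixes below is an assumption for illustration.
pub fn parse_duration(input: &str) -> Result<Duration, String> {
    let s = input.trim();
    // Leading ASCII digits are the magnitude; everything after them is the unit.
    let split = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let (magnitude, unit) = s.split_at(split);
    if magnitude.is_empty() {
        return Err(format!("missing magnitude in {:?}", input));
    }
    let n: u64 = magnitude
        .parse()
        .map_err(|e| format!("invalid magnitude in {:?}: {}", input, e))?;
    // Converts `n` units of `mult` seconds each, rejecting overflow.
    let secs = |mult: u64| {
        n.checked_mul(mult)
            .map(Duration::from_secs)
            .ok_or_else(|| format!("duration {:?} is too large", input))
    };
    match unit {
        "ms" => Ok(Duration::from_millis(n)),
        "s" => secs(1),
        "m" => secs(60),
        "h" => secs(60 * 60),
        "d" => secs(60 * 60 * 24),
        // A bare number is rejected: this is the ambiguity the commit removes.
        "" => Err(format!("missing unit in {:?}", input)),
        other => Err(format!("unknown unit {:?} in {:?}", other, input)),
    }
}
```

With this sketch, `parse_duration("600ms")` and `parse_duration("10m")` both succeed, while a bare `"600"` is an error, so a value can no longer be read as milliseconds in one place and minutes in another.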

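The Tap ID bug described in commit 77017eedea (#885) in the table above comes from a counter being copied whenever the server is cloned. The sketch below shows the general fix of sharing one atomic counter across clones. `TapServer` and `next_tap_id` are hypothetical names for this example, not the proxy's actual types.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a server type that is cloned once per request.
#[derive(Clone, Default)]
pub struct TapServer {
    // The `Arc` makes every clone point at the same counter. A plain `usize`
    // field would be copied on clone, so each clone would start again at 0.
    next_id: Arc<AtomicUsize>,
}

impl TapServer {
    /// Returns a fresh ID and advances the shared counter.
    pub fn next_tap_id(&self) -> usize {
        self.next_id.fetch_add(1, Ordering::Relaxed)
    }
}
```

Cloning the server and calling `next_tap_id` on each clone yields 0, 1, 2, and so on, because `fetch_add` returns the previous value of the single shared counter.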
205 Commits