linkerd2

Commit Graph

Author	SHA1	Message	Date
Eliza Weisman	605e68dff6	Add pretty durations to panics from `assert_eventually!` (#677 ) This PR adds the pretty-printing for durations I added in #676 to the panic message from the `assert_eventually!` macro added in #669. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-04-06 10:49:17 -07:00
Brian Smith	c31f4ba993	Remove unused conversions for Destination. (#701 ) These have not been used for a while; they are dead code. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-06 07:35:35 -10:00
Brian Smith	7bc4ffd0a4	Revert "Proxy: Refactor DNS name parsing and normalization (#673 )" (#700 ) This reverts commit `311ef410a8`. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-05 16:49:32 -10:00
Brian Smith	1b223723bc	Revert "Proxy: Refactor poll_destination() in service discovery. (#674 )" (#698 ) This reverts commit `4fb9877b89`. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-05 16:36:01 -10:00
Risha Mars	2f5b5ea5f2	Start implementing conduit stat summary endpoint (#671 ) Start implementing new conduit stat summary endpoint. Changes the public-api to call prometheus directly instead of the telemetry service. Wired through to `api/stat` on the web server, as well as `conduit statsummary` on the CLI. Works for deployments only. Current implementation just retrieves requests and mesh/total pod count (so latency stats are always 0). Uses API defined in #663 Example queries the stat endpoint will eventually satisfy in #627 This branch includes commits from @klingerf * run ./bin/dep ensure * run ./bin/update-go-deps-shas	2018-04-05 17:05:06 -07:00
Brian Smith	4fb9877b89	Proxy: Refactor poll_destination() in service discovery. (#674 ) No change in behavior is intended here. Split poll_destination() into two parts, one that operates locally on the DestinationSet, and the other that operates on data that isn't wholly local to the DestinationSet. This makes the code easier to understand. This is being done in preparation for adding DNS fallback polling to poll_destination(). Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-05 13:05:11 -10:00
Brian Smith	311ef410a8	Proxy: Refactor DNS name parsing and normalization (#673 ) Only the destination service needs normalized names (and even then, that's just temporary). The rest of the code needs the name as it was given, except case-normalized (lowercased). Because DNS fallback isn't implemented in service discovery yet, Outbound still uses a temporary workaround using FullyQualifiedName to keep things working; that will be removed once DNS fallback is implemented in service discovery. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-05 12:32:12 -10:00
Andrew Seigner	28d5007cdf	Harmonize Prometheus label usage (#690 ) The Destination service used slightly different labels than the telemetry pipeline expected, specifically, prefixed with `k8s_`. Make all Prometheus labels consistent by dropping `k8s_`. Also rename `pod_name` to `pod` for consistency with `deployment`, etc. Also update and reorganize `proxy-metrics.md` to reflect new labelling. Fixes #655 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-05 15:09:06 -07:00
Andrew Seigner	65be27c3c0	Fix ci job failing when new Docker image added (#691 ) The master ci job executes a `docker-pull master` prior to building, to bootstrap the Docker image cache. This command fails if the PR being merged to master introduces a new Docker image, for example: https://travis-ci.org/runconduit/conduit/jobs/362841328 This changes the master ci job to handle a `docker-pull master` failure gracefully. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-05 15:01:54 -07:00
Andrew Seigner	9508e11b45	Build conduit-specific Grafana Docker image (#679 ) Using a vanilla Grafana Docker image as part of `conduit install` avoided maintaining a conduit-specific Grafana Docker image, but made packaging dashboard json files cumbersome. Roll our own Grafana Docker image, that includes conduit-specific dashboard json files. This significantly decreases the `conduit install` output size, and enables dashboard integration in the docker-compose environment. Fixes #567 Part of #420 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-05 14:20:05 -07:00
Eliza Weisman	18fa42ebd0	Pretty-print durations in log messages (#676 ) This branch adds simple pretty-printing to durations in timeout log messages. If the duration is >= 1 second, it's printed in seconds with a fractional part. If the duration is less than 1 second, it is printed in milliseconds. This simple formatting may not be sufficient as a formatting rule for all cases, but should be sufficient for printing our relatively small timeouts. Log messages now look something like this: ``` ERROR 2018-04-04T20:05:49Z: conduit_proxy: turning operation timed out after 100 ms into 500 ``` Previously, they looked like this: ``` ERROR 2018-04-04T20:07:26Z: conduit_proxy: turning operation timed out after Duration { secs: 0, nanos: 100000000 } into 500 ``` I made this change partially because I wanted to make the panics from the `assert_eventually!` macro added in #669 more readable.	2018-04-05 13:47:19 -07:00
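The formatting rule described in #676 can be sketched roughly as follows. This is a minimal illustration, not the proxy's actual code: the `HumanDuration` wrapper name is hypothetical, and the exact unit suffixes are assumed from the `100 ms` sample in the commit message.

```rust
use std::fmt;
use std::time::Duration;

/// Hypothetical display wrapper (name not taken from the proxy source).
pub struct HumanDuration(pub Duration);

impl fmt::Display for HumanDuration {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        if self.0.as_secs() == 0 {
            // Under one second: print whole milliseconds.
            // Sub-millisecond precision is truncated in this sketch.
            write!(f, "{} ms", self.0.subsec_millis())
        } else {
            // One second or more: print seconds with a fractional part.
            write!(f, "{} s", self.0.as_secs_f64())
        }
    }
}
```

With this sketch, `HumanDuration(Duration::from_millis(100))` formats as `100 ms`, which matches the shape of the new log line quoted in the commit.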
Eliza Weisman	49bf01b0da	Add `assert_eventually!` macro to help de-flake telemetry tests (#669 ) Closes #615. Based on @olix0r's suggestion in https://github.com/runconduit/conduit/issues/613#issuecomment-376024744, this PR adds an `assert_eventually!` macro to retry an assertion a set number of times, waiting for 15 ms between retries. This is loosely based on ScalaTest's [eventually](http://doc.scalatest.org/1.8/org/scalatest/concurrent/Eventually.html). I've rewritten the flaky telemetry tests to use the `assert_eventually!` macro, to compensate for delays in the served metrics being updated between client requests and metrics scrapes.	2018-04-05 11:23:34 -07:00
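A retrying assertion of the kind described in #669 might look something like the sketch below. The macro body and the `retries:` argument syntax are assumptions made for illustration; only the macro name and the 15 ms wait come from the commit message.

```rust
/// Sketch of a retrying assertion. Evaluates `$cond` once, then up to
/// `$retries` more times, sleeping 15 ms between attempts. Panics if the
/// condition never becomes true.
macro_rules! assert_eventually {
    ($cond:expr, retries: $retries:expr) => {{
        let mut attempt = 0;
        loop {
            if $cond {
                break;
            }
            if attempt >= $retries {
                panic!(
                    "assertion `{}` still failed after {} retries",
                    stringify!($cond),
                    $retries
                );
            }
            attempt += 1;
            // Fully qualified paths so the macro needs no imports at the call site.
            ::std::thread::sleep(::std::time::Duration::from_millis(15));
        }
    }};
}
```

A test would call it as `assert_eventually!(metrics_contain("request_total"), retries: 5)`, where `metrics_contain` stands in for whatever check the test performs against the scraped metrics.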
Eliza Weisman	6b370b4466	Split labels out of `prometheus.rs` into its own file (#680 ) The proxy's `telemetry/metrics/prometheus.rs` file was starting to get long and hard to find one's way around in. I split the prometheus labels code out into a separate submodule and made `RequestLabels` and `ResponseLabels` public. This seems like a reasonable division of the code, and the resultant files are much easier to read.	2018-04-04 15:49:17 -07:00
Oliver Gould	2dc964c583	Move control::discovery::Cache into its own module (#672 ) The proxy's control::discovery module is becoming a bit dense in terms of what it implements. In order to make this code more understandable, and to be able to use a similar caching strategy in other parts of the controller, the `control::cache` module now holds discovery's cache implementation. This module is only visible within the `control` module, and it now exposes two new public methods: `values()` and `set_reset_on_next_modification()`.	2018-04-04 14:27:04 -07:00
Eliza Weisman	01628bfa43	Fix missing comma in gRPC status code labels (#670 ) Fixes the issue caught by @olix0r in https://github.com/runconduit/conduit/pull/661#issuecomment-378431155	2018-04-04 10:41:21 -07:00
Risha Mars	d1a39ea6bf	Define a new telemetry Stat API (#663 ) * Define a new telemetry Stat API Proposal definition for a new Stat API, for the purposes of satisfying the queries proposed in #627. StatSummary will replace Stat once implemented and the original Stat deleted.	2018-04-03 14:45:58 -07:00
Brian Smith	06bf78ccdf	Use Rust 1.25 to build Docker images. (#667 ) Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-03 08:22:29 -10:00
Franziska von der Goltz	eff848a8cf	fix pod status and count in control plane dashboard (#659 ) * fix pod status and count display in control plane dashboard section: - the control plane would show terminated and stale deployments in the UI, which is confusing and might indicate errors - this filters out terminated and failed component deploys from the UI - note that pending deploys will still be counted and represented with a greyed out status dot - Fixes: #606 Signed-off-by: Franziska von der Goltz <franziska@vdgoltz.eu>	2018-04-03 10:39:35 -07:00
Phil Calçado	19001f8d38	Add pod-based metric_labels to destinations response (#429 ) (#654 ) * Extracted logic from destination server * Make tests follow style used elsewhere in the code * Extract single interface for resolvers * Add tests for k8s and ipv4 resolvers * Fix small usability issues * Update dep * Act on feedback * Add pod-based metric_labels to destinations response * Add documentation on running control plane to BUILD.md Signed-off-by: Phil Calcado <phil@buoyant.io> * Fix mock controller in proxy tests (#656) Signed-off-by: Eliza Weisman <eliza@buoyant.io> * Address review feedback * Rename files in the destination package Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-04-02 18:36:57 -07:00
Andrew Seigner	ee042e1943	Rename grafana viz to top-line (#666 ) The primary Grafana dashboard was named 'viz' from a prototype. Rename 'viz' to 'Top Line'. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-02 18:10:35 -07:00
Brian Smith	df9ead9c36	Use Go 1.10.1 to build all Go code. (#650 ) Go 1.10.1 is a security release. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-04-02 14:58:30 -10:00
Andrew Seigner	bf721466e3	Filter out conduit controller pods from Grafana (#657 ) The Grafana dashboards were displaying all proxy-enabled pods, including conduit controller pods. In the old telemetry pipeline filtering these out required knowledge of the controller's namespace, which the dashboards are agnostic to. This change leverages the new `conduit_io_control_plane_component` prometheus label to filter out proxy-enabled controller components. Part of #420 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-04-02 17:56:12 -07:00
Sean McArthur	47f9665b8e	proxy: allow disabling protocol detection on specific ports (#648 ) - Adds environment variables to configure a set of ports that, when an incoming connection has an SO_ORIGINAL_DST with a port matching, will disable protocol detection for that connection and immediately start a TCP proxy. - Adds a default list of well known ports: SMTP and MySQL. Closes #339	2018-04-02 14:24:36 -07:00
Andrew Seigner	97546e0646	Modify simulate-proxy to be more pod-centric (#653 ) simulate-proxy uses a deployment object from kubernetes to simulate each proxy metrics endpoint. Modify simulate-proxy to instead use a pod to simulate each proxy metrics endpoint. This ensures that each metrics endpoint consistently represents a pod in kubernetes, including its namespace, deployment, and label information. This change also adds support for: - a new `metric-ports` flag, default is `10000-10009`. - `classification`, `pod_name`, and `pod_template_hash` labels Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-03-30 13:28:45 -07:00
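The port check described in #648 could be sketched along these lines. Everything here is illustrative: the function names are hypothetical, the comma-separated format is an assumption about how the environment variable is encoded, and the commit message does not name the variables or list the default port numbers (25 and 3306 are simply the standard SMTP and MySQL ports).

```rust
use std::collections::HashSet;
use std::num::ParseIntError;

/// Parses a comma-separated port list such as "25,3306".
/// The format is assumed; the real proxy may parse its configuration differently.
pub fn parse_port_set(s: &str) -> Result<HashSet<u16>, ParseIntError> {
    s.split(',')
        .map(|p| p.trim())
        .filter(|p| !p.is_empty())
        .map(|p| p.parse::<u16>())
        .collect()
}

/// Returns true when protocol detection should be skipped for a connection
/// whose SO_ORIGINAL_DST port is `orig_dst_port`, so that the connection
/// is proxied as opaque TCP.
pub fn skip_protocol_detection(disabled: &HashSet<u16>, orig_dst_port: u16) -> bool {
    disabled.contains(&orig_dst_port)
}
```

Keeping the ports in a `HashSet` makes the per-connection check a single lookup, which matters because it runs on every accepted connection.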
Phil Calçado	bbed49c5bd	Refactor destination service and add tests in preparation to add information about labels (#645 ) * Extracted logic from destination server * Make tests follow style used elsewhere in the code * Extract single interface for resolvers * Add tests for k8s and ipv4 resolvers * Fix small usability issues * Update dep * Act on feedback Signed-off-by: Phil Calcado <phil@buoyant.io>	2018-03-30 11:36:48 -07:00
Brian Smith	f931dec3b3	Proxy: Completely replace current set of destinations on reconnect (#632 ) Previously, when the proxy was disconnected from the Destination service and then reconnected, the proxy would not forget old, outdated entries in its cache of endpoints. If those endpoints had been removed while the proxy was disconnected then the proxy would never become aware of that. Instead, on the first message after a reconnection, replace the entire set of cached entries with the new set, which may be empty. Prior to this change, the new test outbound_destinations_reset_on_reconnect_followed_by_no_endpoints_exists passed already but outbound_destinations_reset_on_reconnect_followed_by_add_none and outbound_destinations_reset_on_reconnect_followed_by_remove_none failed. Now all these tests pass. Fixes #573 Signed-off-by: Brian Smith <brian@briansmith.org>	2018-03-29 16:50:08 -10:00
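The reset-on-reconnect behavior from #632 can be illustrated with a small sketch. The `EndpointCache` type and its method names are hypothetical and much simpler than the proxy's real cache; the point is only that the first update after a reconnect discards the stale set, even when that update adds or removes nothing.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical, simplified endpoint cache (not the proxy's actual type).
#[derive(Default)]
pub struct EndpointCache {
    addrs: HashSet<SocketAddr>,
    reset_on_next_update: bool,
}

impl EndpointCache {
    pub fn new() -> Self {
        Self::default()
    }

    /// Called when the stream to the Destination service is re-established.
    pub fn on_reconnect(&mut self) {
        self.reset_on_next_update = true;
    }

    /// Drops all stale entries if this is the first update after a reconnect.
    fn reset_if_pending(&mut self) {
        if self.reset_on_next_update {
            self.addrs.clear();
            self.reset_on_next_update = false;
        }
    }

    pub fn add(&mut self, new: impl IntoIterator<Item = SocketAddr>) {
        self.reset_if_pending();
        self.addrs.extend(new);
    }

    pub fn remove(&mut self, gone: &[SocketAddr]) {
        self.reset_if_pending();
        for addr in gone {
            self.addrs.remove(addr);
        }
    }

    pub fn contains(&self, addr: &SocketAddr) -> bool {
        self.addrs.contains(addr)
    }

    pub fn len(&self) -> usize {
        self.addrs.len()
    }
}
```

An empty `add` or `remove` after `on_reconnect()` still clears the cache, which corresponds to the `add_none` and `remove_none` test cases named in the commit message.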
Andrew Seigner	8fe742e2de	Update Grafana dashboards to use new proxy metrics (#637 ) The Top-line and Deployment Grafana dashboards relied on the soon-to-be-removed telemetry pipeline metrics. Update the Grafana dashboards to query for the new, proxy-based metrics. Grafana dashboard layouts have not changed. Depends on #635 to render metrics. Part of #420. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-03-29 13:00:01 -07:00
Brian Smith	bae72c32ed	Proxy: Factor out Destination service connection logic (#631 ) * Proxy: Factor out Destination service connection logic Centralize the connection initiation logic for the Destination service to make it easier to maintain. Clarify that the `rx` field isn't needed prior to a (re)connect. Signed-off-by: Brian Smith <brian@briansmith.org> * Rename `rx` to `query`. Signed-off-by: Brian Smith <brian@briansmith.org> * "recoonect" -> "reconnect" Signed-off-by: Brian Smith <brian@briansmith.org>	2018-03-29 08:20:57 -10:00
Andrew Seigner	666c83e963	Add pod_name to Prometheus labels (#649 ) Previously we were using the instance label to uniquely identify a pod. This meant that getting stats by pod name would require extra queries to Kubernetes to map pod name to instance. This change adds a pod_name label to metrics at collection time. This should not affect cardinality as pod_name is invariant with respect to instance. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-03-29 11:07:35 -07:00
Deshi Xiao	732f3c1565	fix grafana CrashLoopBackoff on image 5.0.3 (#646 ) This is a known issue with Grafana in k8s. grafana/grafana:5.0.4 was just released today. Update the repo from 5.0.3 to 5.0.4. Fixes #582 Signed-off-by: Deshi Xiao <xiaods@gmail.com>	2018-03-29 09:36:11 -07:00
Oliver Gould	6e435754a1	Improve CLI docker caching (#612 ) Currently, the CLI docker image copies the entire `controller` directory, though the CLI only requires a few of its subdirectories. This causes the CLI's docker cache to be needlessly invalidated when, for instance, a service implementation changes. By restricting the copied directories to `controller/{api,public,util}`, build caching is improved.	2018-03-29 09:29:06 -07:00
Carl Lerche	288e041b8f	proxy: Update h2 to 0.1.3 (#640 ) Signed-off-by: Carl Lerche <me@carllerche.com>	2018-03-29 09:22:54 -07:00
Franziska von der Goltz	67fac9d240	remove toggle sorting functionality from TableComponent (#630 ) remove toggle sorting functionality from TableComponent: - tables displaying metrics toggled between sorted and unsorted when the same button was clicked. This was confusing behavior for the user. - this PR removes the toggle functionality and introduces a BaseTable Component that extends antd's component without the capability to toggle - Fixes: #566 Signed-off-by: Franziska von der Goltz <franziska@vdgoltz.eu>	2018-03-28 18:01:34 -07:00
Eliza Weisman	c688cf6028	Add response classification to proxy metrics (#639 ) This PR adds a `classification` label to proxy response metrics, as @olix0r described in https://github.com/runconduit/conduit/issues/634#issuecomment-376964083. The label is either "success" or "failure", depending on the following rules: + if the response had a gRPC status code, then - gRPC status code 0 is considered a success - all others are considered failures + else if the response had an HTTP status code, then - status codes < 500 are considered success, - status codes >= 500 are considered failures + else if the response stream failed then - the response is a failure. I've also added end-to-end tests for the classification of HTTP responses (with some work towards classifying gRPC responses as well). Additionally, I've updated `doc/proxy_metrics.md` to reflect the added `classification` label. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-03-28 14:49:00 -07:00
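The classification rules listed in #639 translate fairly directly into a match. The sketch below is an illustration under assumptions: the `Classification` enum, the `classify` signature, and the use of `Option` to model "no status observed" are not taken from the proxy source.

```rust
/// Value of the `classification` label.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Classification {
    Success,
    Failure,
}

impl Classification {
    /// Label value as it would appear in the metrics output.
    pub fn as_label(self) -> &'static str {
        match self {
            Classification::Success => "success",
            Classification::Failure => "failure",
        }
    }
}

/// Classifies a response. `None` for both statuses models a response
/// stream that failed before any status was observed (an assumption
/// of this sketch).
pub fn classify(grpc_status: Option<u32>, http_status: Option<u16>) -> Classification {
    match (grpc_status, http_status) {
        // A gRPC status takes precedence: only code 0 is a success.
        (Some(0), _) => Classification::Success,
        (Some(_), _) => Classification::Failure,
        // Otherwise fall back to the HTTP status: below 500 is a success.
        (None, Some(code)) if code < 500 => Classification::Success,
        (None, Some(_)) => Classification::Failure,
        // No status at all: the stream failed.
        (None, None) => Classification::Failure,
    }
}
```

Note that a gRPC failure carried on an HTTP 200 response is still a failure under these rules, because the gRPC status is checked first.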
Andrew Seigner	1ed4a93b5e	Higher velocity metrics from simulate-proxy (#635 ) simulate-proxy increments a single set of metrics on each iteration, and also randomizes http status codes, leaving counters unchanged across several collections. Modify simulate-proxy to increment all metrics on each iteration, provide a 90% success rate, ensure a pod does not call itself, and increase proxy count from 3 to 10. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-03-28 13:30:02 -07:00
Kevin Lingerfelt	59c75a73a9	Add tests/utils/scripts for running integration tests (#608 ) * Add tests/utils/scripts for running integration tests Add a suite of integration tests in the `test/` directory, as well as utilities for testing in the `testutil/` directory. You can use the `bin/test-run` script to run the full suite of tests, and the `bin/test-cleanup` script to cleanup after the tests. The test/README.md file has more information about running tests. @pcalcado, @franziskagoltz, and @rmars also contributed to this change. * Create TEST.md file at the root of the repo * Update based on review feedback * Relax external service IP timeout for GKE * Update TEST.md with more info about different types of test runs * More updates to TEST.md based on review feedback Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-03-27 15:06:55 -07:00
Andrew Seigner	fe35509406	Clean up Prometheus labels scraped from proxy (#633 ) The Prometheus scrape config collects from Conduit proxies, and maps Kubernetes labels to Prometheus labels, appending "k8s_". This change keeps the resultant Prometheus labels consistent with their source Kubernetes labels. For example: "deployment" and "pod_template_hash". Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-03-27 15:01:08 -07:00
Eliza Weisman	40b9b345a5	All counters in proxy telemetry wrap on overflows (#603 ) In #602, @olix0r suggested that telemetry counters should wrap on overflows, as "most timeseries systems (like prometheus) are designed to handle this case gracefully." This PR changes counters to use explicitly wrapping arithmetic. Closes #602. Signed-off-by: Eliza Weisman <eliza@buoyant.io>	2018-03-27 14:03:12 -07:00
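Explicitly wrapping arithmetic, as adopted in #603, can be sketched with a small counter type. The `Counter` struct and its method names are hypothetical; the relevant part is the use of `wrapping_add`, which wraps to zero on overflow in both debug and release builds, whereas a plain `+` panics on overflow in debug builds.

```rust
/// Hypothetical monotonic counter that wraps on overflow instead of panicking.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Counter(u64);

impl Counter {
    /// Starts a counter at an arbitrary value (useful for testing the wrap).
    pub fn from_value(v: u64) -> Self {
        Counter(v)
    }

    pub fn incr(&mut self) {
        self.0 = self.0.wrapping_add(1);
    }

    pub fn add(&mut self, n: u64) {
        self.0 = self.0.wrapping_add(n);
    }

    pub fn value(&self) -> u64 {
        self.0
    }
}
```

A wrapped counter looks like a counter reset to the scraper, which is the case the quoted comment says timeseries systems such as Prometheus are designed to handle.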
Brian Smith	7dc21f9588	Add the NoEndpoints message to the Destination API (#564 ) Have the controller tell the client whether the service exists, not just which endpoints are available. This way we can implement fallback logic to alternate service discovery mechanisms for ambiguous names. Signed-off-by: Brian Smith <brian@briansmith.org> Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>	2018-03-27 10:45:41 -10:00
Eliza Weisman	e7aa3d4105	Add process_start_time_seconds Prometheus metric (#628 ) As described in #619. `process_start_time_seconds` is the idiomatic way of reporting to Prometheus the uptime of a process. It should contain the time in seconds since the beginning of the Unix epoch. The proxy now exports this metric: ``` ➜ http get localhost:4191/metrics HTTP/1.1 200 OK Content-Length: 902 Content-Type: text/plain; charset=utf-8 Date: Mon, 26 Mar 2018 22:09:55 GMT # HELP request_total A counter of the number of requests the proxy has received. # TYPE request_total counter # HELP request_duration_ms A histogram of the duration of a request. This is measured from when the request headers are received to when the request stream has completed. # TYPE request_duration_ms histogram # HELP response_total A counter of the number of responses the proxy has received. # TYPE response_total counter # HELP response_duration_ms A histogram of the duration of a response. This is measured from when theresponse headers are received to when the response stream has completed. # TYPE response_duration_ms histogram # HELP response_latency_ms A histogram of the total latency of a response. This is measured from whenthe request headers are received to when the response stream has completed. # TYPE response_latency_ms histogram process_start_time_seconds 1522102089 ``` Closes #619	2018-03-27 12:54:31 -07:00
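Producing the `process_start_time_seconds` line from #628 amounts to recording the Unix time once at startup and rendering it on each scrape. The helper names below are hypothetical, and the sketch emits only the bare sample line shown in the commit message.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Seconds since the Unix epoch for `t`, or 0 if `t` is before the epoch.
pub fn unix_seconds(t: SystemTime) -> u64 {
    t.duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0)
}

/// Renders the metric line in the Prometheus text format.
pub fn render_start_time(start_secs: u64) -> String {
    format!("process_start_time_seconds {}\n", start_secs)
}
```

A process would call `unix_seconds(SystemTime::now())` once at startup and keep the result, so that every scrape reports the same value.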
Eliza Weisman	bdcdfa8874	Actually skip flaky tests on CI and in Docker (#626 ) Flaky proxy tests were not actually being ignored properly. This is due to our use of a Cargo workspace; as it turns out that Cargo doesn't propagate feature flags from the workspace to the crates in the workspace (see rust-lang/cargo#4753). If I run `cargo test --no-default-features` in the root directory, the `flaky_tests` feature is still passed, and the flaky tests still run: ``` ➜ cargo test --no-default-features Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs Running target/debug/deps/conduit_proxy-0e0ab2829c6b743f running 13 tests test fully_qualified_authority::tests::test_normalized_authority ... ok test ctx::transport::tests::same_addr_ip6_compat_ipv4 ... ok test ctx::transport::tests::same_addr_ipv4 ... ok test ctx::transport::tests::same_addr_ip6_mapped_ipv4 ... ok test ctx::transport::tests::same_addr_ipv6 ... ok test telemetry::tap::match_::tests::http_from_proto ... ok test inbound::tests::recognize_default_no_ctx ... ok test telemetry::tap::match_::tests::tcp_from_proto ... ok test telemetry::tap::match_::tests::tcp_matches ... ok test inbound::tests::recognize_default_no_loop ... ok test transparency::tcp::tests::duplex_doesnt_hang_when_one_half_finishes ... ok test inbound::tests::recognize_default_no_orig_dst ... ok test inbound::tests::recognize_orig_dst ... ok test result: ok. 13 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/conduit_proxy-74584a35ef749a60 running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/discovery-73cd0b65bd7a45ae running 16 tests test http1::absolute_uris::outbound_reconnects_if_controller_stream_ends ... ok test http1::outbound_reconnects_if_controller_stream_ends ... ok test http1::absolute_uris::outbound_uses_orig_dst_if_not_local_svc ... ok test http1::outbound_asks_controller_without_orig_dst ... 
ok test http1::absolute_uris::outbound_asks_controller_api ... ok test http1::outbound_asks_controller_api ... ok test http1::absolute_uris::outbound_asks_controller_without_orig_dst ... ok test http2::outbound_reconnects_if_controller_stream_ends ... ok test http2::outbound_asks_controller_api ... ok test http2::outbound_asks_controller_without_orig_dst ... ok test http1::outbound_uses_orig_dst_if_not_local_svc ... ok server h1 error: invalid HTTP version specified test http2::outbound_uses_orig_dst_if_not_local_svc ... ok ERROR 2018-03-26T20:54:09Z: conduit_proxy: turning Error caused by underlying HTTP/2 error: protocol error: frame with invalid size into 500 test outbound_updates_newer_services ... ok ERROR 2018-03-26T20:54:09Z: conduit_proxy: turning operation timed out after Duration { secs: 0, nanos: 100000000 } into 500 test http1::absolute_uris::outbound_times_out ... ok ERROR 2018-03-26T20:54:09Z: conduit_proxy: turning operation timed out after Duration { secs: 0, nanos: 100000000 } into 500 test http2::outbound_times_out ... ok ERROR 2018-03-26T20:54:09Z: conduit_proxy: turning operation timed out after Duration { secs: 0, nanos: 100000000 } into 500 test http1::outbound_times_out ... ok test result: ok. 16 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/telemetry-cb5bee2d2b94332c running 12 tests test metrics_endpoint_inbound_request_count ... ok test metrics_endpoint_inbound_request_duration ... ok test metrics_endpoint_outbound_request_count ... ok test records_latency_statistics ... ignored test telemetry_report_errors_are_ignored ... ok test metrics_endpoint_outbound_request_duration ... ok test metrics_have_no_double_commas ... ok test http1_inbound_sends_telemetry ... ok test inbound_sends_telemetry ... ok test inbound_aggregates_telemetry_over_several_requests ... ok test metrics_endpoint_inbound_response_latency ... ok test metrics_endpoint_outbound_response_latency ... ok test result: ok. 
11 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out Running target/debug/deps/transparency-9d14bf92d8ba3700 running 19 tests ERROR 2018-03-26T20:54:10Z: conduit_proxy: turning Error caused by underlying HTTP/2 error: protocol error: unexpected internal error encountered into 500 test http11_upgrade_not_supported ... ok test http11_absolute_uri_differs_from_host ... ok test http10_without_host ... ok test http1_head_responses ... ok test http10_with_host ... ok test http1_connect_not_supported ... ok test http1_bodyless_responses ... ok test http1_content_length_zero_is_preserved ... ok test http1_removes_connection_headers ... ok test http1_one_connection_per_host ... ok test inbound_http1 ... ok test inbound_tcp ... ok test http1_requests_without_body_doesnt_add_transfer_encoding ... ok test http1_response_end_of_file ... ok test http1_requests_without_host_have_unique_connections ... ok test outbound_tcp ... ok test tcp_with_no_orig_dst ... ok test tcp_connections_close_if_client_closes ... ok test outbound_http1 ... ok test result: ok. 19 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/conduit_proxy_controller_grpc-7fdac3528475b1dc running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/conduit_proxy_router-024926cac5d328ee running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/convert-ae9bd3b8fee21c85 running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/futures_mpsc_lossy-4afd31454ff77b40 running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Doc-tests conduit-proxy running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Doc-tests conduit-proxy-controller-grpc running 0 tests test result: ok. 
0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Doc-tests convert running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Doc-tests conduit-proxy-router running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Doc-tests futures-mpsc-lossy running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out ``` This also happens if the `-p` flag is used to run tests only in the `conduit-proxy` crate: ``` ➜ cargo test -p conduit-proxy --no-default-features Compiling conduit-proxy v0.3.0 (file:///Users/eliza/Code/go/src/github.com/runconduit/conduit/proxy) Finished dev [unoptimized + debuginfo] target(s) in 17.27 secs Running target/debug/deps/conduit_proxy-0e0ab2829c6b743f running 13 tests test fully_qualified_authority::tests::test_normalized_authority ... ok test ctx::transport::tests::same_addr_ip6_mapped_ipv4 ... ok test ctx::transport::tests::same_addr_ipv6 ... ok test ctx::transport::tests::same_addr_ipv4 ... ok test ctx::transport::tests::same_addr_ip6_compat_ipv4 ... ok test inbound::tests::recognize_default_no_loop ... ok test telemetry::tap::match_::tests::http_from_proto ... ok test inbound::tests::recognize_default_no_orig_dst ... ok test inbound::tests::recognize_default_no_ctx ... ok test transparency::tcp::tests::duplex_doesnt_hang_when_one_half_finishes ... ok test telemetry::tap::match_::tests::tcp_from_proto ... ok test inbound::tests::recognize_orig_dst ... ok test telemetry::tap::match_::tests::tcp_matches ... ok test result: ok. 13 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/conduit_proxy-74584a35ef749a60 running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/discovery-73cd0b65bd7a45ae running 16 tests test http1::absolute_uris::outbound_reconnects_if_controller_stream_ends ... ok test http1::outbound_reconnects_if_controller_stream_ends ... 
ok test http1::absolute_uris::outbound_asks_controller_without_orig_dst ... ok test http1::absolute_uris::outbound_uses_orig_dst_if_not_local_svc ... ok test http1::outbound_asks_controller_without_orig_dst ... ok test http1::absolute_uris::outbound_asks_controller_api ... ok test http1::outbound_asks_controller_api ... ok test http1::outbound_uses_orig_dst_if_not_local_svc ... ok test http2::outbound_reconnects_if_controller_stream_ends ... ok test http2::outbound_asks_controller_without_orig_dst ... ok test http2::outbound_asks_controller_api ... ok test http2::outbound_uses_orig_dst_if_not_local_svc ... ok server h1 error: invalid HTTP version specified ERROR 2018-03-26T20:56:50Z: conduit_proxy: turning Error caused by underlying HTTP/2 error: protocol error: frame with invalid size into 500 test outbound_updates_newer_services ... ok ERROR 2018-03-26T20:56:50Z: conduit_proxy: turning operation timed out after Duration { secs: 0, nanos: 100000000 } into 500 test http1::absolute_uris::outbound_times_out ... ok ERROR 2018-03-26T20:56:50Z: conduit_proxy: turning operation timed out after Duration { secs: 0, nanos: 100000000 } into 500 test http1::outbound_times_out ... ok ERROR 2018-03-26T20:56:50Z: conduit_proxy: turning operation timed out after Duration { secs: 0, nanos: 100000000 } into 500 test http2::outbound_times_out ... ok test result: ok. 16 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/telemetry-cb5bee2d2b94332c running 12 tests test metrics_endpoint_inbound_request_duration ... ok test metrics_endpoint_inbound_request_count ... ok test metrics_endpoint_outbound_request_count ... ok test metrics_endpoint_outbound_request_duration ... ok test telemetry_report_errors_are_ignored ... ok test metrics_have_no_double_commas ... ok test inbound_sends_telemetry ... ok test http1_inbound_sends_telemetry ... ok test inbound_aggregates_telemetry_over_several_requests ... ok test metrics_endpoint_inbound_response_latency ... 
ok test metrics_endpoint_outbound_response_latency ... ok test records_latency_statistics ... ok test result: ok. 12 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running target/debug/deps/transparency-9d14bf92d8ba3700 running 19 tests ERROR 2018-03-26T20:56:55Z: conduit_proxy: turning Error caused by underlying HTTP/2 error: protocol error: unexpected internal error encountered into 500 test http1_connect_not_supported ... ok test http11_upgrade_not_supported ... ok test http10_without_host ... ok test http11_absolute_uri_differs_from_host ... ok test http1_head_responses ... ok test http10_with_host ... ok test http1_bodyless_responses ... ok test http1_content_length_zero_is_preserved ... ok test http1_removes_connection_headers ... ok test http1_one_connection_per_host ... ok test http1_response_end_of_file ... ok test http1_requests_without_host_have_unique_connections ... ok test inbound_http1 ... ok test inbound_tcp ... ok test http1_requests_without_body_doesnt_add_transfer_encoding ... ok test outbound_tcp ... ok test tcp_with_no_orig_dst ... ok test tcp_connections_close_if_client_closes ... ok test outbound_http1 ... ok test result: ok. 19 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Doc-tests conduit-proxy running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out ``` However, if I `cd` into the `proxy` directory (so that Cargo treats the `conduit-proxy` crate as the root project, rather than the workspace) and pass the `--no-default-features` flag, the flaky tests are skipped as expected: ``` ➜ (cd proxy && exec cargo test --no-default-features) Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs Running /Users/eliza/Code/go/src/github.com/runconduit/conduit/target/debug/deps/conduit_proxy-ac198a96228a056e running 13 tests test fully_qualified_authority::tests::test_normalized_authority ... ok test ctx::transport::tests::same_addr_ipv4 ... 
ok test ctx::transport::tests::same_addr_ip6_compat_ipv4 ... ok test ctx::transport::tests::same_addr_ipv6 ... ok test ctx::transport::tests::same_addr_ip6_mapped_ipv4 ... ok test telemetry::tap::match_::tests::tcp_from_proto ... ok test telemetry::tap::match_::tests::http_from_proto ... ok test transparency::tcp::tests::duplex_doesnt_hang_when_one_half_finishes ... ok test telemetry::tap::match_::tests::tcp_matches ... ok test inbound::tests::recognize_default_no_ctx ... ok test inbound::tests::recognize_default_no_loop ... ok test inbound::tests::recognize_default_no_orig_dst ... ok test inbound::tests::recognize_orig_dst ... ok test result: ok. 13 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running /Users/eliza/Code/go/src/github.com/runconduit/conduit/target/debug/deps/conduit_proxy-41e0f900f97e194b running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Running /Users/eliza/Code/go/src/github.com/runconduit/conduit/target/debug/deps/discovery-7ba7fe16345a347a running 16 tests test http1::absolute_uris::outbound_times_out ... ignored test http1::outbound_times_out ... ignored test http1::absolute_uris::outbound_reconnects_if_controller_stream_ends ... ok test http1::outbound_reconnects_if_controller_stream_ends ... ok test http1::absolute_uris::outbound_uses_orig_dst_if_not_local_svc ... ok test http1::outbound_uses_orig_dst_if_not_local_svc ... ok test http1::absolute_uris::outbound_asks_controller_without_orig_dst ... ok test http1::outbound_asks_controller_without_orig_dst ... ok test http1::outbound_asks_controller_api ... ok test http1::absolute_uris::outbound_asks_controller_api ... ok test http2::outbound_times_out ... ignored server h1 error: invalid HTTP version specified ERROR 2018-03-26T21:48:32Z: conduit_proxy: turning Error caused by underlying HTTP/2 error: protocol error: frame with invalid size into 500 test http2::outbound_reconnects_if_controller_stream_ends ... 
ok test http2::outbound_uses_orig_dst_if_not_local_svc ... ok test http2::outbound_asks_controller_api ... ok test http2::outbound_asks_controller_without_orig_dst ... ok test outbound_updates_newer_services ... ok test result: ok. 13 passed; 0 failed; 3 ignored; 0 measured; 0 filtered out Running /Users/eliza/Code/go/src/github.com/runconduit/conduit/target/debug/deps/telemetry-b0763b64edd8fc68 running 12 tests test metrics_endpoint_inbound_request_count ... ignored test metrics_endpoint_inbound_request_duration ... ignored test metrics_endpoint_inbound_response_latency ... ignored test metrics_endpoint_outbound_request_count ... ignored test metrics_endpoint_outbound_request_duration ... ignored test metrics_endpoint_outbound_response_latency ... ignored test records_latency_statistics ... ignored test telemetry_report_errors_are_ignored ... ok test metrics_have_no_double_commas ... ok test http1_inbound_sends_telemetry ... ok test inbound_sends_telemetry ... ok test inbound_aggregates_telemetry_over_several_requests ... ok test result: ok. 5 passed; 0 failed; 7 ignored; 0 measured; 0 filtered out Running /Users/eliza/Code/go/src/github.com/runconduit/conduit/target/debug/deps/transparency-300fd801daa85ccf running 19 tests ERROR 2018-03-26T21:48:32Z: conduit_proxy: turning Error caused by underlying HTTP/2 error: protocol error: unexpected internal error encountered into 500 test http1_connect_not_supported ... ok test http11_upgrade_not_supported ... ok test http10_without_host ... ok test http10_with_host ... ok test http11_absolute_uri_differs_from_host ... ok test http1_head_responses ... ok test http1_bodyless_responses ... ok test http1_removes_connection_headers ... ok test http1_content_length_zero_is_preserved ... ok test http1_one_connection_per_host ... ok test http1_response_end_of_file ... ok test http1_requests_without_body_doesnt_add_transfer_encoding ... ok test inbound_tcp ... ok test inbound_http1 ... 
ok test http1_requests_without_host_have_unique_connections ... ok test outbound_tcp ... ok test tcp_connections_close_if_client_closes ... ok test tcp_with_no_orig_dst ... ok test outbound_http1 ... ok test result: ok. 19 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out Doc-tests conduit-proxy running 0 tests test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out ``` I'm wrapping the `cd` and `cargo test` command in a subshell so that the CWD on Travis is still in the repo root when the command exits, but the return value from `cargo test` is propagated. Closes #625	2018-03-26 17:11:06 -07:00
Brian Smith	7247ffeee3	Proxy: Clarify destination test support code queue handling (#617 ) Use `VecDeque` to make the queue structure clear. Follow good practice by minimizing the amount of time the lock is held. Clarify how defaulting logic works. Signed-off-by: Brian Smith <brian@briansmith.org>	2018-03-26 10:45:05 -10:00
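The queue handling described in #617 can be illustrated with a minimal sketch. This is not the proxy's test support code: `UpdateQueue`, `push`, and `next_or` are hypothetical names, and the queued items are plain strings for brevity. It only shows a `VecDeque` behind a `Mutex`, with the lock released before the defaulting logic runs.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical FIFO of canned updates, standing in for a test controller's queue.
struct UpdateQueue {
    pending: Mutex<VecDeque<String>>,
}

impl UpdateQueue {
    fn new() -> Self {
        UpdateQueue {
            pending: Mutex::new(VecDeque::new()),
        }
    }

    /// Appends an update to the back of the queue.
    fn push(&self, update: &str) {
        self.pending.lock().unwrap().push_back(update.to_string());
    }

    /// Pops the oldest update, or returns `default` when the queue is empty.
    fn next_or(&self, default: &str) -> String {
        // The mutex guard is a temporary, so the lock is released at the end
        // of this statement, before any defaulting work happens.
        let popped = self.pending.lock().unwrap().pop_front();
        // Defaulting is explicit and runs without holding the lock.
        popped.unwrap_or_else(|| default.to_string())
    }
}
```

Calling `push("a")` and then `next_or("none")` twice would yield `"a"` followed by the default, which makes the first-in, first-out order and the fallback visible at the call site.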
Oliver Gould	006360aa90	Skip flaky tests for #613 (#614 ) The metrics endpoint tests are flaky because there are no guarantees that the metrics pipeline has processed events before the metrics endpoint is read. This can cause CI to fail spuriously. Disable these tests from running in CI until #613 is resolved.	2018-03-25 14:26:14 -07:00
Oliver Gould	c5179ba10b	Remove references to `cli` images (#611 ) CI builds on master have been failing to publish `cli-bin` images because the `docker-push` script still refers to the `cli` image, though it was removed in `e7c4a9d4b9`. This change removes references to the `cli` image from all scripts.	2018-03-25 09:46:34 -07:00
Andrew Seigner	12c6531546	Update docker-compose environment to match prod (#609 ) The Prometheus config in the docker-compose environment had fallen behind the prod setup. This change updates the docker-compose environment in the following ways: - Prometheus config more closely matches prod, based on #583 - simulate-proxy labels match prod, based on #605 - add Grafana container Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-03-23 17:00:39 -07:00
Andrew Seigner	291d8e97ab	Move injected data from env var to k8s labels (#605 ) The inject code detects the object it is being injected into, and writes self-identifying information into the CONDUIT_PROMETHEUS_LABELS environment variable, so that conduit-proxy may read this information and report it to Prometheus at collection time. This change puts the self-identifying information directly into Kubernetes labels, which Prometheus already collects, removing the need for conduit-proxy to be aware of this information. The resulting label in Prometheus is recorded in the form `k8s_deployment`. Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-03-23 16:11:34 -07:00
Eliza Weisman	9321932918	Add request_duration_ms metric and increment request_total on request end (#589 ) This PR adds the `request_duration_ms` metric to the Prometheus metrics exported by the proxy. It also modifies the `request_total` metric so that it is incremented when a request stream finishes, rather than when it opens, for consistency with how the `response_total` metric is generated. Making this change required modifying `telemetry::sensors::http` to generate a `StreamRequestEnd` event similar to the `StreamResponseEnd` event. This is done similarly to how sensors are added to response bodies, by generalizing the `ResponseBody` type into a `MeasuredBody` type that can wrap a request or response body. Since this changed the type of request bodies, it necessitated changing request types pretty much everywhere else in the proxy codebase in order to fix the resulting type errors, which is why the diff for this PR is so large. Closes #570	2018-03-22 15:27:34 -07:00
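The idea in #589 of counting a request when its stream finishes, rather than when it opens, can be sketched as follows. This is an assumption-laden illustration: the real proxy wraps asynchronous HTTP bodies, whereas this sketch uses a synchronous `Iterator` as a stand-in, and the fields and constructor of `MeasuredBody` here are hypothetical.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical wrapper that observes the end of a body stream.
/// An `Iterator` stands in for the proxy's asynchronous body type.
struct MeasuredBody<I> {
    inner: I,
    /// Shared counter playing the role of `request_total`.
    total: Rc<Cell<u64>>,
    /// Guards against counting the same stream twice.
    ended: bool,
}

impl<I> MeasuredBody<I> {
    fn new(inner: I, total: Rc<Cell<u64>>) -> Self {
        MeasuredBody {
            inner,
            total,
            ended: false,
        }
    }
}

impl<I: Iterator> Iterator for MeasuredBody<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<I::Item> {
        let item = self.inner.next();
        // Increment only when the inner stream reports its end, and only once.
        if item.is_none() && !self.ended {
            self.ended = true;
            self.total.set(self.total.get() + 1);
        }
        item
    }
}
```

Because the same wrapper type can sit around a request body or a response body, one piece of code can produce both kinds of end-of-stream event, which is the generalization the commit message describes.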
Andrew Seigner	fb1d6a5c66	Introduce Conduit Health dashboard (#591 ) In addition to dashboards that display service health, we need a dashboard to display the health of the Conduit service mesh itself. This change introduces a conduit-health dashboard. It currently only displays health metrics for the control plane components. Proxy health will come later. Fixes #502 Part of #420 Signed-off-by: Andrew Seigner <siggy@buoyant.io>	2018-03-22 15:16:03 -07:00
Alex Leong	d50550515e	Add the proxy pod owner as a Prometheus label (#448 ) Update the inject command to set a CONDUIT_PROMETHEUS_LABELS proxy environment variable with the name of the pod spec that the proxy is injected into. This will later be used as a label value when the proxy is exposing metrics. Fixes: #426 Signed-off-by: Alex Leong <alex@buoyant.io>	2018-03-22 15:10:51 -07:00
Eliza Weisman	3f10c80256	Fix double comma in outbound metrics (#601 ) Fixes #600 The proxy metrics endpoint has a bug where metrics recorded in the outbound direction can contain two commas in a row when no outbound label is present. This occurs because the code for formatting the outbound direction label mistakenly assumed that there would always be a destination pod owner label as well, but the proxy isn't currently aware of the destination's pod owner (waiting for #429). I've fixed this issue by moving the place where the comma is output from the `fmt::Display` impl for `RequestLabels` to the `fmt::Display` impl for `OutboundLabels`. This way, the comma between the `direction` and `dst_` labels is only output when the `dst_` label is present. This bug made it to master since all of the proxy end-to-end tests for metrics only test the inbound router. I've rectified this issue by adding tests on the outbound router as well (which would fail against the current master due to the double comma bug). I've also added a test that asserts there are no double commas in exported metrics, to protect against regressions of this bug.	2018-03-22 14:17:10 -07:00
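The shape of the fix in #601 can be sketched roughly as below. The struct fields, the `authority` label, and the label values are assumptions made for illustration and are not taken from the proxy source. The only point shown is that the comma separating `direction` from a `dst_` label is written by the outbound labels' own `fmt::Display` impl, and only when a `dst_` label exists.

```rust
use std::fmt;

/// Hypothetical stand-in for the proxy's outbound label set.
struct OutboundLabels {
    /// Pre-formatted destination label such as `dst_deployment="web"`,
    /// or `None` when the destination's owner is unknown.
    dst: Option<String>,
}

/// Hypothetical stand-in for the labels attached to a request metric.
struct RequestLabels {
    authority: String,
    /// `None` means the request was seen in the inbound direction.
    outbound: Option<OutboundLabels>,
}

impl fmt::Display for OutboundLabels {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "direction=\"outbound\"")?;
        // The separator lives here, so it is emitted only when a `dst_`
        // label actually follows it.
        if let Some(dst) = &self.dst {
            write!(f, ",{}", dst)?;
        }
        Ok(())
    }
}

impl fmt::Display for RequestLabels {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "authority=\"{}\",", self.authority)?;
        // No comma is written after the direction label at this level.
        match &self.outbound {
            Some(out) => write!(f, "{}", out),
            None => write!(f, "direction=\"inbound\""),
        }
    }
}
```

A regression check in the spirit of the commit's new test would format labels with and without a destination and assert that the result never contains `,,`.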


1837 Commits All Branches Search
