Commit Graph

163 Commits

Author SHA1 Message Date
Brian Smith 4fadfa2243
Don't manually install Docker in Travis CI. (#297)
Travis CI now installs Docker 17.09 or later, which is good enough for
us, so avoid installing Docker manually.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-08 08:31:00 -10:00
Jeff Haynie f721a0f800 Fixed mispelling in conduit inject args (#300) 2018-02-08 12:48:40 -05:00
Eliza Weisman 915f08ac4c
Store proxy latencies in a structure that matches controller histogram (#11)
The proxy currently stores latency values in an `OrderMap` and reports every observed latency value to the controller's telemetry API since the last report. The telemetry API then sends each individual value to Prometheus. This doesn't scale well when there are a large number of proxies making reports. 

I've modified the proxy to use a fixed-size histogram that matches the histogram buckets in Prometheus. Each report now includes an array indicating the histogram bounds, and each response scope contains a set of counts corresponding to each index in the bounds array, indicating the number of times a latency in that bucket was observed. The controller then reports the upper bound of each bucket to Prometheus, and can use the proxy's reported set of bucket bounds so that the observed values will be correct even if the bounds in the control plane are changed independently of those set in the proxy.

I've also modified `simulate-proxy` to generate the new report structure, and added tests in the proxy's telemetry test suite validating the new behaviour.
2018-02-07 18:02:59 -08:00
Kevin Lingerfelt fbb4e812f8
Change default version string from "unknown" to "latest" (#284)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-02-07 10:01:12 -08:00
Risha Mars ff15574a0d
MetricsTable: Consolidate latency, success, request metrics into one tab (#276)
* Consolidate latency, success, request metrics into one tab
on the SortableMetricsTable

- removes sparklines from the table
- makes tables sortable by default
- move pod table in DeploymentDetail to its own row

* remove request distribution column, reorder columns
2018-02-07 09:50:01 -08:00
Oliver Gould a2d537f5c4
Use a load-aware balancer (#251)
Currently, the conduit proxy uses a simplistic Round-Robin load
balancing algorithm. This strategy degrades severely when individual
endpoints exhibit abnormally high latency.

This change improves this situation somewhat by making the load balancer
aware of the number of outstanding requests to each endpoint. When nodes
exhibit high latency, they should tend to have more pending requests
than faster nodes; and the Power-of-Two-Choices node selector can be
used to distribute requests to lesser-loaded instances.

From the finagle guide:

    The algorithm randomly picks two nodes from the set of ready endpoints
    and selects the least loaded of the two. By repeatedly using this
    strategy, we can expect a manageable upper bound on the maximum load of
    any server.

    The maximum load variance between any two servers is bound by
    ln(ln(n))` where `n` is the number of servers in the cluster.

Signed-off-by: Oliver Gould <ver@buoyant.io>
2018-02-07 09:39:31 -08:00
Kevin Lingerfelt 447ee142c0
Stop running "cargo check" in CI (#285)
* Stop running "cargo check" in CI

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Attempt to clear cargo cache

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Remove cache clearing step

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-02-06 15:57:22 -08:00
Oliver Gould 95b91c5976
Set PROXY_SKIP_TESTS for CI Docker build (#283)
The SKIP_TESTS flag is not used. The PROXY_SKIP_TESTS flag should be set
so that unoptimized proxy tests are not built.
2018-02-06 13:37:38 -08:00
Oliver Gould 6a0936e699
Remove proxy/Dockerfile-deps (#279)
The current proxy Dockerfile configuration does not cache dependencies
well, which can increase build times substantially.

By carefully splitting proxy/Dockerfile into several stages that mock
parts of the project, dependencies may be built and cached in Docker
such that changes to the proxy only require building the conduit-proxy
crate.

Furthermore, proxy/Dockerfile now runs the proxy's tests before
producing an artifact, unless the ` PROXY_SKIP_TESTS` build-arg is set
and not-empty.

The `PROXY_UNOPTIMIZED` build-arg has been added to support quicker,
debug-friendly builds.
2018-02-06 13:01:38 -08:00
Risha Mars 185f48b086
DeploymentsList: Replace "least healthy" with "most active" deployments (#277)
* Replace Least Healthy Deployments section with Most Active Deployments (MAD)

* Fix old arguments to ConduitLink
2018-02-06 11:35:57 -08:00
Phil Calçado 8041d07e4d
Add --url option to skip browser when opening dashboard (#281)
Browser opening is a UX nicety, but it shouldn't be mandatory.
2018-02-06 14:07:55 -05:00
Phil Calçado 9c03764a29
Remove hardcoded port and shared state for http test (#282)
We now create a new test HTTP server per test case instead of sharing it across them all.

This should solve the data races we have experienced on Travis.

Signed-off-by: Phil Calcado <phil@buoyant.io>
2018-02-06 13:48:14 -05:00
Oliver Gould e2093e37f8
Move the Rust gRPC bindings to a dedicated crate (#275)
The proxy depends on `protoc`-generated gRPC bindings to communicate
with the controller. In order to generate these bindings, build-time
dependencies must be compiled.

In order to support a more granular, cacheable build scheme, a new crate
has been created to house these gRPC bindings,
`conduit-proxy-controller-grpc`.

Because `TryFrom` and `TryInto` conversions are implemented for
protobuf-defined types, the `convert` module also had to be moved to
into a dedicated crate.

Furthermore, because the proxy's tests require that
`quickcheck::Aribtrary` be implemented for protobuf types, the
`conduit-proxy-controller-grpc` crate supports an _arbitrary_ feature
fla protobuf types, the `conduit-proxy-controller-grpc` crate supports
an _arbitrary_ feature flag.

While we're moving these libraries around, the `tower-router` crate has
been moved to `proxy/router` and renamed to `conduit-proxy-router.`
`futures-mpsc-lossy` has been moved into the proxy directory but has not
been renamed.

Finally, the `proxy/Dockerfile-deps` image has been updated to avoid the
wasteful building of dependency artifacts, as they are not actually used
by `proxy/Dockerfile`.
2018-02-06 10:31:48 -08:00
Brian Smith c52600eb78
Check SHA-256 sum of dep binary before running it. (#272)
Previously we didn't verify that the downloaded dep binary is the right
binary.

Verify that the downloaded binary is correct.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-05 16:02:35 -10:00
Phil Calçado 5628d3c8f4
Add --short and --client to CLI version command (#274)
These are very useful when writing scripts, e.g.

conduit_version=`conduit version --short --client`

Signed-off-by: Phil Calcado <phil@buoyant.io>
2018-02-05 17:02:45 -05:00
Risha Mars c2da891be7
Minor UI title renames and other tweaks (#256)
* ServiceMesh: plot public-api instead of destination, retitle destination and telemetry graphs

* ResourceHealthOverview: Hide Inbound/Outbound request rate if there are 0 deployments

* ResourceMetricOverview: retitle DeploymentDetail/PodDetail sections
2018-02-05 11:27:31 -08:00
Brian Smith 704f00ae8f
Allow bin/dep wrapper script for dep to work on Windows. (#271)
Previously the script only worked on Linux and macOS.

Make it work on Windows too.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-05 09:24:18 -10:00
Brian Smith 4da0b57204
Always use the 64-bit version of dep. (#270)
The logic for choosing the 32-bit vs. 64-bit version of dep was
inverted.

Fix this by simply always using the 64-bit version.

Signed-off-by: Brian Smith <brian@briansmith.org>
2018-02-05 09:07:31 -10:00
Risha Mars 9887f10749
Add ability to change the time window for metrics fetching throughout the app (#237)
* Control metricsWindow from root of app

- Add buttons [currently hidden] on metrics pages to control window of metrics requests
- Consolidate metricsWindow usage (stop passing it around)
- Add a ConduitLink component so we can stop passing around pathPrefix
- Add tests for ApiHelpers

* Hide the time window buttons; fix bug in absolute links
* Add a note explaining why metricWindow buttons are disabled
* Convert ConduitLink in to a component that wraps another
2018-02-05 10:56:17 -08:00
Alex Leong b691c2e25b
Rename --version flag in conduit install to --conduit-version (#255)
This makes the `conduit install` flag match the `conduit inject` flag.

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-02-05 10:45:44 -08:00
Andrew Seigner 4156af786d
Enable race detection in ci (#259)
We previously did not have race detection enabled because our tests
would fail. Following #249, this is no longer the case.

Enable race detection in ci and build instructions. This change also
fixes client_test.go attempting to allocate a 2GB buffer due to bad test
input.

Fixes #173

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-02-02 15:04:52 -08:00
Risha Mars 9cadb30795
Fix reference to emojivoto voting deployment (#267) 2018-02-02 13:06:59 -08:00
Andrew Seigner 9a40d984ff
Replace shelling out with kubernetes proxy (#249)
The conduit dashboard command asychronously shells out and runs "kubectl
proxy".

This change replaces the shelling out with calls to kubernetes proxy
APIs. It also allows us to enable race detection in our go tests, as the
shell out code tests did not pass race detection.

Fixes #173

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-02-02 10:31:59 -08:00
Alex Leong fa2f5a0140
Add dep wrapper script to ensure consistent version of dep is used (#253)
* Add `bin/dep` which fetches a fixed version of `dep` to be used. 
* Upgrade from dep 0.3.1 to 0.4.1
* Fix inconsistent Gopkg.lock by checking in the result of `bin/dep ensure`

Signed-off-by: Alex Leong <alex@buoyant.io>
2018-02-01 16:09:05 -08:00
Oliver Gould e771f61298
Add a newline to dco.yml (#254) 2018-02-01 15:16:02 -08:00
Andrew Seigner 277c06cf1e
Simplify and refactor k8s labels and annnotations (#227)
The conduit.io/* k8s labels and annotations we're redundant in some
cases, and not flexible enough in others.

This change modifies the labels in the following ways:
`conduit.io/plane: control` => `conduit.io/controller-component: web`
`conduit.io/controller: conduit` => `conduit.io/controller-ns: conduit`
`conduit.io/plane: data` => (remove, redundant with `conduit.io/controller-ns`)
It also centralizes all k8s labels and annotations into
pkg/k8s/labels.go, and adds tests for the install command.

Part of #201

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-02-01 14:12:06 -08:00
Oliver Gould abaf498cfc
Do not require DCO signoff for project members (#252)
We only need the DCO bot to validate external submissions.
2018-02-01 13:45:32 -08:00
Kevin Lingerfelt 9ff439ef44
Add -log-level flag for install and inject commands (#239)
* Add -log-level flag for install and inject commands

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Turn off all CLI logging by default, rename inject and install flags

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Re-enable color logging

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-02-01 12:38:07 -08:00
Eliza Weisman eddc37de28 Adopt external tower-grpc and tower-h2 deps #225)
The conduit repo includes several library projects that have since been
moved into external repos, including `tower-grpc` and `tower-h2`.

This change removes these vendored libraries in favor of using the new
external crates.
2018-02-01 11:57:02 -08:00
Risha Mars a9d4a3d74e
Add more prometheus instrumentation (latency, response size) (#174)
We added basic prometheus instrumentation, but this only encapsulated basic go metrics and
 request counts. This adds latency and response size metrics exporting as well, to the 
public-api server, theweb server and the telemetry server.

Since the util function in grpc.go was basically used to wrap the server creation in a prometheus handler, I added the other prometheus constants in there and renamed the file to prometheus.go. 

- Add request duration and response size instrumentation to web and public api
- Also add latency monitoring to telemetry service requests
- Rename util/grpc.go to util/prometheus.go
2018-02-01 09:50:31 -08:00
Risha Mars f3925a07fb
Various small UI tweaks (#234)
* Various small UI naming tweaks

- align top two tables in the service mesh page
- "All Deployments" -> "Deployments"
- reorder latency p50, p95, p99
- "Current success" -> "Success rate"

* Add margin to incomplete mesh message, reorder latency in TabbedMetricsTable
* Right align numbers in service mesh page
2018-01-31 18:09:15 -08:00
Risha Mars 2c8a0f7563
Use dots reporter instead of nyan reporter in CI (#238)
Signed-off-by: Risha Mars <mars@buoyant.io>
2018-01-31 18:07:19 -08:00
Dennis Adjei-Baah 01312f9ffe
Prepare for v0.2.0 release (#248)
* prepare for v0.2.0 release

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
2018-01-31 15:39:48 -08:00
Andrew Seigner 3856b38550
README update to 2018 (#233)
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-01-30 10:47:19 -08:00
Sean McArthur 9720a32de7
proxy: fix tcp_with_no_orig_dst test (#229)
Sometimes, the try_read will return a connection error, sometimes it
will just return EOF. Handle both cases.

Closes #226
2018-01-29 15:15:06 -08:00
Kevin Lingerfelt 4a76c6448b
Update cli subcommands to print errors when encountered (#221)
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-01-29 11:28:19 -08:00
Sean McArthur 5b53123ec2
readme: update to say TCP, HTTP/1, and HTTP/2 are supported (#203) 2018-01-26 13:54:01 -08:00
Kevin Lingerfelt 7399df83f1
Set conduit version to match conduit docker tags (#208)
* Set conduit version to match conduit docker tags

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Remove --skip-inbound-ports for emojivoto

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Rename git_sha => git_sha_head

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Switch to using the go linker for setting the version

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Log conduit version when go servers start

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Cleanup conduit script

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Add --short flag to head sha command

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>

* Set CONDUIT_VERSION in docker-compose env

Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
2018-01-26 11:43:45 -08:00
Risha Mars 8f63c2b7a5
Fix link to deployments from autocomplete not including pathPrefix (#209)
Signed-off-by: Risha Mars <mars@buoyant.io>
2018-01-25 16:47:39 -08:00
Sean McArthur b861318e86
proxy: fix h1 streams to trigger response end events
Response End events were only triggered after polling the trailers of
a response, but when the Response is given to a hyper h1 server, it
doesn't know about trailers, so they were never polled!

The fix is that the `BodyStream` glue will now poll the wrapped body for
trailers after it sees the end of the data, before telling hyper the
stream is over. This ensures a ResponseEnd event is emitted.

Includes a proxy telemetry test over h1 connections.
2018-01-25 16:36:16 -08:00
Andrew Seigner aa17e37ab5
Add docker deps validation to ci (#207)
If docker image tags were out of date, ci would not fail until the
docker-deploy stage (master merge).

Modify ci to validate tags as part of the default ci run.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-01-25 12:41:02 -08:00
Andrew Seigner 4e2eb18f1d
Use cargo frozen flag in build scripts (#206)
The cargo commands in our docker and ci scripts were at risk for
modifying Cargo.lock and cache.

Using cargo's --frozen flag (and --locked during fetch) ensures our
build is consistent with what's defined across Cargo.toml, Cargo.lock,
and cached build artifacts.

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
2018-01-25 11:08:15 -08:00
Phil Calçado 9410da471a
Better error handling for Tap (#177)
Previously, running `$conduit tap` would return a `Unexpected EOF` error when the server wasn't available. This was due to a few problems with the way we were handling errors all the way down the tap server. This change fixes that and cleans some of the protobuf-over-HTTP code.

- first step towards #49
- closes #106
2018-01-25 11:49:38 -05:00
Andrew Seigner d0a0bb22bd
Move EosCtx to common for Tap and Telemetery (#204)
* Make Eos optional in TapEvent

grpc_status not being set in protobuf is the same as being set to zero,
which is also status OK

Modify TapEvent to include an optional EOS struct

Signed-off-by: Andrew Seigner <siggy@buoyant.io>

Part of #198

* Add Eos to proto & proxy tap end-of-stream events

The proxy now outputs `Eos` instead of `grpc_status` in all end-of-stream tap events. The EOS value is set to `grpc_status_code` when the response ended with a `grpc_status` trailer, `http_reset_code` when the response ended with a reset, and no `Eos` when the response ended gracefully without a `grpc_status` trailer.

This PR updates the proxy. The proto and controller changes are in PR #204.
Part of #198. Closes #202

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-01-24 15:48:00 -08:00
Risha Mars d0119162e8
Switch to ant sider/content Layout modules (#188)
* Switch to ant sider/content Layout modules, to help style sidebar
This fixes the problem of the sidebar not extending all the way on long pages.
* Fix a bug where the autocomplete options weren't being reset when an item was selected
2018-01-24 11:38:54 -08:00
Eliza Weisman 9e49054963
Classify non-gRPC status codes for HTTP telemetry (#200)
Currently, all "success"/"failure" classifications in the telemetry API are made based on the `grpc-status` trailer. If the trailer is not present, then a request is assumed to have failed. As we start proxying non-gRPC traffic, the controller needs to also be aware of HTTP status codes, so that non-gRPC requests are not assumed to always fail.

I've modified the telemetry API server to classify requests based on their HTTP status codes when the `grpc-status` trailer is not present. 

I've also modified the `simulate-proxy` script to generate fake HTTP/2 traffic without the `grpc-status` trailer.

Closes #196

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
2018-01-24 10:57:23 -08:00
Sean McArthur 54aef56e25
proxy: add transparent protocol detection and handling
The proxy will now try to detect what protocol new connections are
using, and route them accordingly. Specifically:

- HTTP/2 stays the same.
- HTTP/1 is now accepted, and will try to send an HTTP/1 request
  to the target.
- If neither HTTP/1 nor 2, assume a TCP stream and simply forward
  between the source and destination.

* tower-h2: fix Server Clone bounds
* proxy: implement Async{Read,Write} extra methods for Connection

Closes #130 
Closes #131
2018-01-23 16:14:07 -08:00
Andrew Seigner 47ec2fb190
Remove DOCKER_FORCE_BUILD, disable symbolic tags (#168)
DOCKER_FORCE_BUILD, combined with symbolic tags, added complexity and
risk of running unintended versions of the code.

This change removes DOCKER_FORCE_BUILD, and sets all Docker tags
programmatically. The decision to pull or build has been moved up the
stack from _docker.sh to the docker-build-* scripts. Workflows that
want to favor docker pulls (like ci), can do so explicitly via
docker-pull.

fixes #141

Signed-off-by: Andrew Seigner <andrew@sig.gy>
2018-01-23 12:02:28 -08:00
Risha Mars b9f5ad093f
Rename js components to clarify component relationships (#179)
* Rename components to clarify component relationships
* Rename Deployment to DeploymentDetail to match PodDetail
* Rename Deployments to DeploymentsList to clarify which page this is
* Rename StatPane to ResourceMetricsOverview to be a less generic name
* Rename HealthPane -> ResourceHealthOverview
* Rename StatPaneStat -> ResourceOverviewMetric

Signed-off-by: Risha Mars <mars@buoyant.io>
2018-01-23 10:05:53 -08:00
Sean McArthur db913e3d18
controller: echo ip address if destination service receives ip (#186)
Signed-off-by: Sean McArthur <sean@seanmonstar.com>
2018-01-22 16:20:13 -08:00