Kubernetes will do multiple DNS lookups for a name like
`proxy-api.conduit.svc.cluster.local` based on the default search settings
in /etc/resolv.conf for each container:
1. proxy-api.conduit.svc.cluster.local.conduit.svc.cluster.local. IN A
2. proxy-api.conduit.svc.cluster.local.svc.cluster.local. IN A
3. proxy-api.conduit.svc.cluster.local.cluster.local. IN A
4. proxy-api.conduit.svc.cluster.local. IN A
We do not need or want this search to be done, so avoid it by making each
name absolute by appending a period so that the first three DNS queries
are skipped for each name.
The case for `localhost` is even worse because we expect that `localhost` will
always resolve to 127.0.0.1 and/or ::1, but this is not guaranteed if the default
search is done:
1. localhost.conduit.svc.cluster.local. IN A
2. localhost.svc.cluster.local. IN A
3. localhost.cluster.local. IN A
4. localhost. IN A
Avoid these unnecessary DNS queries by making each name absolute, so that the
first three DNS queries are skipped for each name.
Signed-off-by: Brian Smith <brian@briansmith.org>
This PR ensures that the mapping of requests to outbound connections is segregated by `Host:` header values. In most cases, the desired behavior is provided by Hyper's connection pooling. However, Hyper does not handle the case where a request had no `Host:` header and the request URI had no authority part, and the request was routed based on the SO_ORIGINAL_DST in the desired manner. We would like these requests to each have their own outbound connection, but Hyper will reuse the same connection for such requests.
Therefore, I have modified `conduit_proxy_router::Recognize` to allow implementations of `Recognize` to indicate whether the service for a given key can be cached, and to only cache the service when it is marked as cachable. I've also changed the `reconstruct_uri` function, which rewrites HTTP/1 requests, to mark when a request had no authority and no `Host:` header, and the authority was rewritten to be the request's ORIGINAL_DST. When this is the case, the `Recognize` implementations for `Inbound` and `Outbound` will mark these requests as non-cachable.
I've also added unit tests ensuring that A, connections are created per `Host:` header, and B, that requests with no `Host:` header each create a new connection. The first test passes without any additional changes, but the second only passes on this branch. The tests were added in PR #489, but this branch supersedes that branch.
Fixes#415. Closes#489.
Grafana by default calls out to grafana.com to check for updates. As
user's of Conduit do not have direct control over updating Grafana
directly, this update check is not needed.
Disable Grafana's update check via grafana.ini.
This is also a workaround for #155, root cause of #519.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
The main goal with this change is to make it clear, from looking at
Cargo.lock, that the Conduit proxy doesn't depend on OpenSSL. Although
the *ring* crate attempts to avoid conflicts with symbols defined in
OpenSSL, that is a manual process that doesn't have automatic
verification yet.
The secondary goal is to reduce the total number of dependencies to
make (at least) full from-scratch builds, such as those in CI, faster.
As a result of this PR, following the following upstream PRs we
submitted to prost, as well as some similar PRs in other upstream
projects and in conduit inself, our usage of prost now results in us
depending on many fewer crates:
* https://github.com/danburkert/prost/pull/78
* https://github.com/danburkert/prost/pull/79
* https://github.com/danburkert/prost/pull/82
* https://github.com/danburkert/prost/pull/84
Here are the crate dependencies that are removed:
* adler32
* aho-corasick
* build_const
* bzip2
* bzip2-sys
* crc
* curl
* curl-sys
* env_logger (0.4)
* flate2
* lazy_static
* libz-sys
* memchr
* miniz_oxide
* miniz_oxide_c_api
* msdos_time
* openssl-probe
* openssl-sys
* pkg-config
* podio
* regex
* regex-syntax
* schannel
* socket2
* thread_local
* unreachable
* utf8-ranges
* vcpkg
* zip
Pretty much all of these are build dependencies, but Cargo.lock doesn't
distinguish between build dependencies and regular dependencies.
Signed-off-by: Brian Smith <brian@briansmith.org>
* Add --expected-version flag for conduit check command
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Update build instructions
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
As a goal of being a transparent proxy, we want to proxy requests and
responses with as little modification as possible. Basically, servers
and clients should see messages that look the same whether the proxy was
injected or not.
With that goal in mind, we want to make sure that body headers (things
like `Content-Length`, `Transfer-Encoding`, etc) are left alone. Prior
to this commit, we at times were changing behavior. Sometimes
`Transfer-Encoding` was added to requests, or `Content-Length: 0` may
have been removed. While RC 7230 defines that differences are
semantically the same, implementations may not handle them correctly.
Now, we've added some fixes to prevent any of these header changes
from occurring, along with tests to make sure library updates don't
regress.
For requests:
- With no message body, `Transfer-Encoding: chunked` should no longer be
added.
- With `Content-Length: 0`, the header is forwarded untouched.
For responses:
- Tests were added that responses not allowed to have bodies (to HEAD
requests, 204, 304) did not have `Transfer-Encoding` added.
- Tests that `Content-Length: 0` is preserved.
- Tests that HTTP/1.0 responses with no body headers do not have
`Transfer-Encoding` added.
- Tests that `HEAD` responses forward `Content-Length` headers (but not
an actual body).
Closes#447
Signed-off-by: Sean McArthur <sean@seanmonstar.com>
When the conduit proxy is injected into the controller pod, we observe controller pod proxy stats show up as an "outbound" deployment for an unrelated upstream deployment. This may cause confusion when monitoring deployments in the service mesh.
This PR filters out this "misleading" stat in the public api whenever the dashboard requests metric information for a specific deployment.
* exclude telemetry generated by the control plane when requesting deployment metrics
fixes#370
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
The dns_test had assumed DNS changes were deterministically ordered, but
util.DiffAddresses uses a map and therefore does not guarantee ordering.
Fix dns_test to sort TCP Addresses prior to comparison.
Fixes#515
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Grafana dashboards will not be available for the 0.3.1 release, but
BUILD.md provides an (incorrect) way to access Grafana.
Remove mention of Grafana for now. Re-add when dashboards are integrated
into Conduit.
Part of #420.
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Currently, the `Reconnect` middleware does not reconnect on connection errors (see #491) and treats them as request errors. This means that when a connection timeout is wrapped in a `Reconnect`, timeout errors are treated as request errors, and the request returns HTTP 500. Since this is not the desired behavior, the connection timeouts should be removed, at least until their errors can be handled differently.
This PR removes the connect timeouts from `Bind`, as described in https://github.com/runconduit/conduit/pull/483#issuecomment-369380003.
It removes the `CONDUIT_PROXY_PUBLIC_CONNECT_TIMEOUT_MS` environment variable, but _not_ the `CONDUIT_PROXY_PRIVATE_CONNECT_TIMEOUT_MS` variable, since this is also used for the TCP connect timeouts. If we want also want to remove the TCP connection timeouts, I can do that as well.
Closes#483. Fixes#491.
In the initial review for this code (preceding the creation of the
runconduit/conduit repository), it was noted that this container is not
actually used, so this is actually dead code.
Further, this container actualy causes a minor problem, as it doesn't
implement any retry logic, thus it will sometimes often cause errors to
be logged. See
https://github.com/runconduit/conduit/issues/496#issuecomment-370105328.
Further, this is a "buoyantio/" branded container. IF we actually need
such a container then it should be a Conduit-branded container.
See https://github.com/runconduit/conduit/issues/478 for additional
context.
Signed-off-by: Brian Smith <brian@briansmith.org>
This PR changes the proxy to log error messages using `fmt::Display` whenever possible, which should lead to much more readable and meaningful error messages
This is part of the work I started last week on issue #442. While I haven't finished everything for that issue (all errors still are mapped to HTTP 500 error codes), I wanted to go ahead and open a PR for the more readable error messages. This is partially because I found myself merging these changes into other branches to aid in debugging, and because I figured we may as well have the nicer logging on master.
We previously `join`ed on piping data from both sides, meaning
that the future didn't complete until **both** sides had disconnected.
Even if the client disconnected, it was possible the server never knew,
and we "leaked" this future.
To fix this, the `join` is replaced with a `Duplex` future, which pipes
from both ends into the other, while also detecting when one side shuts
down. When a side does shutdown, a write shutdown is forwarded to the
other side, to allow draining to occur for deployments that half-close
sockets.
Closes#434
The Markdown files were all originally named "$x/_index.md"; I renamed
them as follows:
```
for x in `ls ~/conduit-site/conduit.io/content`; do
cp ~/conduit-site/conduit.io/content/$x/_index.md doc/$x.md
done
mv doc/doc.md doc/overview.md
```
When we publish the files on conduit.io we need to do the inverse
transformation to avoid breaking existing links.
The images were embedded using a syntax GitHub doesn't support. Also, the
images were not originally in a subdirectory of docs/.
Use normal Markdown syntax for image embedding, and reference the docs
using relative links to the images/ subdirectory. This way they will show
up in the GitHub UI. When we publish the docs on conduit.io we'll need to
figure out how to deal with this change.
I took the liberty of renaming data-plane.png to dashboard-data-plane.png to
clarify it a bit.
There is no other roadmap so there's no need to qualify this one as
"public." Before it was made public we marked it "public" to emphasize
that it would become public, but that isn't needed now.
Signed-off-by: Brian Smith <brian@briansmith.org>
BUILD.md included a command with an invalid environment variable,
which prevented the proxy from starting.
The IP address `0` is no longer considered valid by the proxy, so the
doc now refers to `0.0.0.0` instead.
Signed-off-by: Ray Tung <rtung@thoughtworks.com>
Kubernetes $KUBECONFIG environment variable is a list of paths to
configuration files, but conduit assumes that it is a single path.
Changes in this commit introduce a straightforward way to discover and
load config file(s).
Complete list of changes:
- Use k8s.io/client-go/tools/clientcmd to deal with kubernetes
configuration file
- rename k8s API and k8s proxy constructors to get rid of redundancy
- remove shell package as it is not needed anymore
Signed-off-by: Igor Zibarev <zibarev.i@gmail.com>
`conduit install` deploys prometheus, but lacks a general-purpose way to
visualize that data.
This change adds a Grafana container to the `conduit install` command. It
includes two sample dashboards, viz and health, in their own respective
source files.
Part of #420
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
As requested by @briansmith in https://github.com/runconduit/conduit/issues/415#issuecomment-369026560 and https://github.com/runconduit/conduit/issues/415#issuecomment-369032059, I've refactored `FullyQualifiedAuthority::normalize` to _always_ return a `FullyQualifiedAuthority`, along with a boolean value indicating whether or not the Destination service should be used for that authority.
This is in contrast to returning an `Option<FullyQualifiedAuthority>` where `None` indicated that the Destination service should not be used, which is what this function did previously.
This is required for further progress on #415.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
As part of `conduit check` command, warn the user if they are
running an outdated version of the cli client or the control
plane components.
Fixes#314
Signed-off-by: Andy Hume <andyhume@gmail.com>
When attempting to tap N pods when N is greater than the target rps, a rounding error occurs that requests 0 rps from each pod and no tap data is returned.
Ensure that tap requests at least 1 rps from each target pod.
Tested in Kubernetes on docker-for-desktop with a 15 replica deployment and a maxRps of 10.
Signed-off-by: Alex Leong <alex@buoyant.io>
```
cargo update -p tempdir
cargo update -p abstract-ns
```
The new version of tempdir actually adds a new dependency, but
apparently that is to fix a bug.
Signed-off-by: Brian Smith <brian@briansmith.org>
Currently we have to download and build two different versions of
the ordermap crate.
I will submit similar PRs for the dependent crates so that we will
eventually all be using the same version of indexmap.
Signed-off-by: Brian Smith <brian@briansmith.org>
This was caused by the fact that a new instance of `env_logger::init()`
was added after the PR that rewrote them all to `env_logger::try_init()`
was added.
Fixes#469
Signed-off-by: Brian Smith <brian@briansmith.org>
Stop initializing env_logger in every test. In env_logger 0.5, it
may only be initialized once per process.
Also, Prost will soon upgrade to env_logger 0.5 and this will
(eventually) help reduce the number of versions of env_logger we
have to build. Turning off the regex feature will (eventually) also
reduce the number of dependencies we have to build. Unfortunately,
as it is now, the number of dependencies has increased because
env_logger increased its dependencies in 0.5.
Signed-off-by: Brian Smith <brian@briansmith.org>
Turning off the default features of quickcheck removes its
`env_logger` and `log` dependencies. It uses older versions of
those packages than conduit-proxy will use, so this will
(eventually) reduce the number of versions of those packages that
get downloaded and built.
Signed-off-by: Brian Smith <brian@briansmith.org>
Hyper depends on tokio-proto with a default feature. By turning off
its default features, we can avoid that dependency. That reduces the
number of dependencies by 4.
Signed-off-by: Brian Smith <brian@briansmith.org>
Version 1.7.0 of the url crate seems to be broken which means we cannot
`cargo update` the proxy without locking url to version 1.6. Since we only
use it in a very limited way anyway, and since we use http::uri for parsing
much more, just switch all uses of the url crate to use http::uri for parsing
instead.
This eliminates some build dependencies.
Signed-off-by: Brian Smith <brian@briansmith.org>
Closes#403.
When the Destination service does not return a result for a service, the proxy connection for that service will hang indefinitely waiting for a result from Destination. If, for example, the requested name doesn't exist, this means that the proxy will wait forever, rather than responding with an error.
I've added a timeout wrapping the service returned from `<Outbound as Recognize>::bind_service`. The timeout can be configured by setting the `CONDUIT_PROXY_BIND_TIMEOUT` environment variable, and defaults to 10 seconds (because that's the default value for [a similar configuration in Linkerd](https://linkerd.io/config/1.3.5/linkerd/index.html#router-parameters)).
Testing with @klingerf's reproduction from #403:
```
curl -sIH 'Host: httpbin.org' $(minikube service proxy-http --url)/get | head -n1
HTTP/1.1 500 Internal Server Error
```
proxy logs:
```rust
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy using controller at HostAndPort { host: Domain("proxy-api.conduit.svc.cluster.local"), port: 8086 }
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy routing on V4(127.0.0.1:4140)
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy proxying on V4(0.0.0.0:4143) to None
proxy-5698f79b66-8rczl conduit-proxy INFO conduit_proxy::transport::connect "controller-client", DNS resolved proxy-api.conduit.svc.cluster.local to 10.0.0.240
proxy-5698f79b66-8rczl conduit-proxy ERR! conduit_proxy::map_err turning service error into 500: Inner(Timeout(Duration { secs: 10, nanos: 0 }))
```
This PR adds a `flaky_tests` cargo feature to control whether or not to ignore tests that are timing-dependent. This feature is enabled by default in local builds, but disabled on CI and in all Docker builds.
Closes#440
The telemetry service in the controller pod uses a non-fully qualified URL to connect to the prometheus pod in the control plane. This PR changes the URL the telemetry's prometheus URL to be fully qualified to be consistent with other URLs in the control plane. This change was tested in minikube. The logs report no errors and looking at the prometheus dashboard shows that stats are being recorded from all conduit proxies.
fixes#414
Signed-off-by: Dennis Adjei-Baah dennis@buoyant.io
Refactor `conduit install` test into a data-driven test. Then add a test of the actual default output of `conduit install`.
This test is useful to make it clear when we change the default
settings of `conduit install`.
Signed-off-by: Brian Smith <brian@briansmith.org>
The control plane is proxied through the Conduit proxy. The Conduit
proxy is based on the base image, and the control plane containers
and the proxy share a networking namespace. This means we don't
need the extra base utilities in the controller images since we can
use the utilties in the proxy image.
This is a step towards building the initial no-networking Conduit CA
pod. Since the Conduit CA will not do any networking of its own, we
networking debugging utilties are not helpful for it. They are
actually an unnecessary risk because they could facilitate the
exfiltration of the private key of the CA. (The Conduit CA pod won't
have the Conduit Proxy injected into it either.)
This also simplifies & slightly speeds up the building of the
controller images. This is a stepping stone towards being able to
build the controller images without `docker build` to improve build
times.
Signed-off-by: Brian Smith <brian@briansmith.org>
The `app` label should be reserved for end-user applications and we
shouldn't use it ourselves. We already have a Conduit-specific label
that is is prefixed with the `conduit.io/` prefix to avoid naming
collisions with users' labels, so just use that one instead.
Signed-off-by: Brian Smith <brian@briansmith.org>
The template used by `conduit install` was hard-coded in install.go.
This change moves the template into its own file, in anticipation of
increasing the template's size and complexity.
Part of #420
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
In order to take advantage of the benefits the conduit proxy gives to deployments, this PR injects the conduit proxy into the control plane pod. This helps us lay the groundwork for future work such as TLS, control plane observability etc.
Fixes#311
Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
The build scripts assume they are executed from the root of this repo.
This prevents running scripts from other locations, for example,
`cd web && ../bin/go-run .`.
Modify the build scripts to work regardless of current directory.
Fixes#301
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
* Use Go 1.10.0 to build Go components.
Take advantage of the new build cache in Go 1.10. Future work on improving
build performance will utilize the build cache further.
Signed-off-by: Brian Smith <brian@briansmith.org>
Previously Dockerfile-go-deps was converted from a multi-stage Dockefile
to a single-stage Dockerfile in anticipation of enabling efficient use
of `--cache-from` in CI. However, that resulted in the image ballooning
in size because it contained the Git repo for every package downloaded
by `dep ensure`.
Bring the image back down to the proper size by removing the temporary
files created.
Signed-off-by: Brian Smith <brian@briansmith.org>