Problem
`conduit stat` would cause a panic for any resource that wasn't in the list
of StatAllResourceTypes
This bug was introduced by https://github.com/runconduit/conduit/pull/1088/files
Solution
Fix writeStatsToBuffer to not depend on what resources are in StatAllResourceTypes
Also adds a unit test and integration test for `conduit stat ns`
* dest service: close open streams on shutdown
* Log instead of print in pkg packages
* Convert ServerClose to a receive-only channel
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
The comments in Outbound::recognize had become somewhat stale as the
logic changed. Furthermore, this implementation may be easier to
understand if broken into smaller pieces.
This change reorganizes the Outbound:recognize method into helper
methods--`destination`, `host_port`, and `normalize`--each with
accompanying docstrings that more accurately reflect the current
implementation.
This also has the side-effect benefit of eliminating a string clone on
every request.
- If error messages are very long, truncate them and display a toggle to show the full message
- Tweak the headings - remove Pod, Container and Image - instead show them as titles
- Also move over from using Ant's Modal.method to the plain Modal component, which is a
little simpler to hook into our other renders.
Depends on #1047.
This PR adds a `tls="true"` label to metrics produced by TLS connections and
requests/responses on those connections, and a `tls="no_config"` label on
connections where TLS was enabled but the proxy has not been able to load
a valid TLS configuration.
Currently, these labels are only set on accepted connections, as we are not yet
opening encrypted connections, but I wired through the `tls_status` field on
the `Client` transport context as well, so when we start opening client
connections with TLS, the label will be applied to their metrics as well.
Closes#1046
Signed-off-by: Eliza Weisman <eliza@buoyanbt.io>
* Proxy: Make TLS server aware of its own identity.
When validating the TLS configuration, make sure the certificate is
valid for the current pod. Make the pod's identity available at that
point in time so it can do so. Since the identity is available now,
simplify the validation of our own certificate by using Rustls's API
instead of dropping down to the lower-level webpli API.
This is a step towards the server differentiating between TLS
handshakes it is supposed to terminate vs. TLS handshakes it is
supposed to pass through.
This is also a step toward the client side (connect) of TLS, which will
reuse much of the configuration logic.
Signed-off-by: Brian Smith <brian@briansmith.org>
Remove the filling and stacking in request rate graphs that combine resources,
to make it easier to spot outliers.
* Grafana: remove fill and stack from individual resource breakouts
* Remove all the stacks and fills from request rates everywhere
The `kubernetes-nodes-cadvisor` Prometheus queries node-level data via
the Kubernetes API server. In some configurations of Kubernetes, namely
minikube and at least one baremetal kubespray cluster, this API call
requires the `get` verb on the `nodes/proxy` resource.
Enable `get` for `nodes/proxy` for the `conduit-prometheus` service
account.
Fixes#912
Signed-off-by: Andrew Seigner <siggy@buoyant.io>
Using MS Edge and probably other clients with the Conduit proxy when
TLS is enabled fails because Rustls doesn't take into consideration
that Conduit only supports one signature scheme (ECDSA P-256 SHA-256).
This bug was fixed in Rustls when ECDSA support was added, after the
latest release. With this change MS Edge can talk to Conduit.
Signed-off-by: Brian Smith <brian@briansmith.org>
Previously, the proxy would not attempt to load its TLS certificates until a fs
watch detected that one of them had changed. This means that if the proxy was
started with valid files already at the configured paths, it would not load
them until one of the files changed.
This branch fixes that issue by starting the stream of changes with one event
_followed_ by any additional changes detected by watching the filesystem.
I've manually tested that this fixes the issue, both on Linux and on macOS, and
can confirm that this fixes the issue. In addition, when I start writing
integration tests for certificate reloading, I'll make sure to include a test
to detect any regressions.
Closes#1133.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Refactor the way the TLS trust anchors are configured in preparation
for the client and server authenticating each others' certificates.
Make the use of client certificates optional pending the implementation
of authorization policy.
Signed-off-by: Brian Smith <brian@briansmith.org>
When a TLS handshake error occurs, the proxy just stops accepting
requests. It seems my expectations of how `Stream` handles errors
were wrong.
The test for this will be added in a separate PR after the
infrastructure needed for TLS testing is added. (This is a chicken
and egg problem.)
Signed-off-by: Brian Smith <brian@briansmith.org>
* Start running integration tests in CI
* Add gcp helper funcs
* Split integration test cleanup into separate phase
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Display proxy container errors in the Web UI
Add an error modal to display pod errors
Add icon to data tables to indicate errors are present
Display errors on the Service Mesh Overview Page and all the resource pages
This PR changes the proxy's Inotify watch code to avoid always falling back to
polling the filesystem when the watched files don't exist yet. It also contains
some additional cleanup and refactoring of the inotify code, including moving
the non-TLS-specific filesystem watching code out of the `tls::config` module
and into a new `fs_watch` module.
In addition, it adds tests for both the polling-based and inotify-based watch
implementations, and changes the polling-based watches to hash the files rather
than using timestamps from the file's metadata to detect changes. These changes
are originally from #1094 and #1091, respectively, but they're included here
because @briansmith asked that all the changes be made in one PR.
Closes#1094. Closes#1091. Fixes#1090. Fixes#1097. Fixes#1061.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
prost-0.4.0 has been released, which removes unnecessary dependencies.
tower-grpc is being updated simultaneously, as this is the proxy's
primary use of prost.
See: https://github.com/danburkert/prost/releases/tag/v0.4.0
- It would be nice to display container errors in the UI. This PR gets the pod's container
statuses and returns them in the public api
- Also add a terminationMessagePolicy to conduit's inject so that we can capture the
proxy's error messages if it terminates
* proxy: Update `rand` to 0.5.1
The proxy depends on rand-0.4, which is superceded by newer APIs in
rand-0.5. Since we're already using rand-0.5 via the tower-balance
crate, it seems appropriate to upgrade the proxy.
* Expand lock files in reviews
protobuf has a `go_package` option that can be used to explicitly name
Go packages such that they can be imported without additional rewrites.
This allows us to store proto files without additional, redundant
directories (which were used for packaging hints, previously).
This change adds an explicit `go_package` to all .proto files and
updates `bin/protoc-go.sh` to ensure these packages are output into
$GOPATH (so that the go_package can be absolute). This removes the need
to manually rewrite imports in bin/protoc-go.sh.
* Add readiness/liveness checks for third party components
Any possible issues with the third party control plane components can wedge the services.
Take the best practices for prometheus/grafana and add them to our template. See #1116
* Update test fixtures for new output
Add an emitWarning to the webpack config so that webpack will compile despite lint
errors when running in development mode. This is necessary to enable development
on the frontend using webpack-dev-server's automatic reloading.
Also sets a NODE_ENV in travis.yml so that the build will fail if linting fails.
In e2093e3, we created a `convert` crate when refactoring the proxy's
gRPC bindings into a dedicated crate.
It's not really necessary to handle `convert` as a crate, given that it
holds a single 39-line file that's mostly comments. It's possible to
"vendor" this file in the proxy, and controller-grpc crate doesn't
even need this trait (in fact, the proxy probably doesn't either).
* Update destination service ot use shared informer instead of custom endpoints informer
* Add additional tests for dst svc endpoints watcher
* Remove service ports when all listeners unsubscribed
* Update go deps
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
* Update desintation service to use shared informer instead of pod watcher
* Add const for pod IP index name
Signed-off-by: Kevin Lingerfelt <kl@buoyant.io>
When developing features in the proxy, that rely on Linux-only OS features,
developers using other operating systems may find it inconvenient to test
their changes. While we run CI builds on Linux, and may have access to Linux
testing environments, this is not as tightly integrated into the proxy
development workflow as running a quick `cargo test` on the host OS.
For example, I found it inconvenient to test the `inotify` based filesystem
watch code I've been adding in recent commits, and had to do things like
opening a WIP PR for a branch to get CI to run the tests. This workflow is not
ideal.
This PR adds an (admittedly somewhat hacky) script and Dockerfile for running
the proxy's tests in Docker. This accomplishes approximately the same goal as
the `PROXY_SKIP_TESTS` flag that we used to have, but with the advantage that
we no longer include the test dependencies in release builds.
Of course, this also means that we no longer share any of the dependencies
between the test docker build and the release docker build, which is a shame.
It might be worthwhile to re-introduce a dependencies image so that cached
builds of the proxy's dependencies can be shared between the test and release
Dockerfiles. However, I thought that deserved to be discussed separately from
the changes I made in this branch.
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Common blacklists have `/api/stat` in them. This causes the dashboard to not load.
`/api/tps-reports` is not in any blacklists, suggests what this route does and is slightly tongue in cheek. Fixes#970
Previously, in conduit stat all we would just print the map of stat results, which
resulted in the order in which stats were displayed varying between prints.
Fix:
Define an array, k8s.StatAllResourceTypes and use the order in this array to print
the map; ensuring a consistent print order every time the command is run.