Commit Graph

17 Commits

Author SHA1 Message Date
Aaron Gable ab79f96d7b
Fixup staticcheck and stylecheck, and violations thereof (#5897)
Add `stylecheck` to our list of lints, since it got separated out from
`staticcheck`. Fix the way we configure both to be clearer and not
rely on regexes.

Additionally fix a number of easy-to-change `staticcheck` and
`stylecheck` violations, allowing us to reduce our number of ignored
checks.

Part of #5681
2022-01-20 16:22:30 -08:00
Jacob Hoffman-Andrews 2d2c723d34
Break the chain of cancellations at the SA (#5459)
A recent mysql driver upgrade caused a performance regression. We
believe this may be due to cancellations getting passed through to the
database driver, which as of the upgrade will more aggressively tear
down connections that experienced a cancellation.

Also, we only recently started propagation cancellations all the way
from the frontend in #5404.

This makes it so the driver doesn't see the cancellation.

Second attempt at #5447
2021-06-24 16:49:32 -07:00
Aaron Gable 229377aabc
Simplify gRPC interceptors (#5435)
Use the built-in grpc-go client and server interceptor chaining
utilities, instead of the ones provided by go-grpc-middleware.
Simplify our interceptors to call their handlers/invokers directly,
instead of delegating to the metrics interceptor, and add the
metrics interceptor to the chains instead.
2021-05-26 10:19:11 -07:00
Aaron Gable 9abb39d4d6
Honeycomb integration proof-of-concept (#5408)
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration to our config files, and by
default configure all of our tests to "mute" (send nothing).

Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.

Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
2021-05-24 16:13:08 -07:00
Jacob Hoffman-Andrews b4e483d38b
Add gRPC MaxConnectionAge config. (#5311)
This allows servers to tell clients to go away after some period of time, which triggers the clients to re-resolve DNS.

Per grpc/grpc#12295, this is the preferred way to do this.

Related: #5307.
2021-03-01 18:37:47 -08:00
Samantha 07aef67fa6
Refactoring tls.Config mutation out of grpc (#5175)
In all boulder services, we construct a single tls.Config object
and then pass it into multiple gRPC setup methods. In all boulder
services but one, we pass the object into multiple clients, and
just one server. In general, this is safe, because all of the client
setup happens on the main thread, and the server setup similarly
happens on the main thread before spinning off the gRPC server
goroutine.

In the CA, we do the above and pass the tlsConfig object into two
gRPC server setup functions. Thus the first server goroutine races
with the setup of the second server.

This change removes the post-hoc assignment of MinVersion,
MaxVersion, and CipherSuites of the tls.Config object passed
to grpc.ClientSetup and grpc.NewServer. And adds those same
values to the cmd.TLSConfig.Load, the method responsible for
constructing the tls.Config object before it's passed to
grpc.ClientSetup and grpc.NewServer.

Part of #5159
2020-11-12 16:24:16 -08:00
Roland Bracewell Shoemaker 3532dce246 Excise grpc maxConcurrentStreams configuration (#4257) 2019-06-12 09:35:24 -04:00
Roland Bracewell Shoemaker dbab48b488 Restrict gRPC TLS connections to 1.2 and ECDHE-RSA-CHACHA20-POLY1305 (#3903) 2018-10-24 16:07:12 -04:00
Daniel McCarney 0e07eacb01
gRPC: Rename histogram rpc_lag -> grpc_lag (#3673) 2018-04-26 16:19:13 -04:00
Daniel McCarney aa810a3142 gRPC: publish RPC latency stat in server interceptor. (#3665)
We may see RPCs that are dispatched by a client but do not arrive at the server for some time afterwards. To have insight into potential request latency at this layer we want to publish the time delta between when a client sent an RPC and when the server received it.

This PR updates the gRPC client interceptor to add the current time to the gRPC request metadata context when it dispatches an RPC. The server side interceptor is updated to pull the client request time out of the gRPC request metadata. Using this timestamp it can calculate the latency and publish it as an observation on a Prometheus histogram.

Accomplishing the above required wiring a clock through to each of the client interceptors. This caused a small diff across each of the gRPC aware boulder commands.

A small unit test is included in this PR that checks that a latency stat is published to the histogram after an RPC to a test ChillerServer is made. It's difficult to do more in-depth testing because using fake clocks makes the latency 0 and using real clocks requires finding a way to queue/delay requests inside of the gRPC mechanisms not exposed to Boulder.

Updates https://github.com/letsencrypt/boulder/issues/3635 - Still TODO: Explicitly logging latency in the VA, tracking outstanding RPCs as a gauge.
2018-04-25 15:37:22 -07:00
Jacob Hoffman-Andrews 2a1cd4981a Allow configuring gRPC's MaxConcurrentStreams (#3642)
During periods of peak load, some RPCs are significantly delayed (on the order of seconds) by client-side blocking. HTTP/2 clients have to obey a "max concurrent streams" setting sent by the server. In Go's HTTP/2 implementation, this value [defaults to 250](https://github.com/golang/net/blob/master/http2/server.go#L56), so the gRPC default is also 250. So whenever there are more than 250 requests in progress at a time, additional requests will be delayed until there is a slot available.

During this peak load, we aren't hitting limits on CPU or memory, so we should increase the max concurrent streams limit to take better advantage of our available resources. This PR adds a config field to do that.

Fixes #3641.
2018-04-12 17:17:17 -04:00
Jacob Hoffman-Andrews 68d5cc3331
Restore gRPC metrics (#3265)
The go-grpc-prometheus package by default registers its metrics with Prometheus' global registry. In #3167, when we stopped using the global registry, we accidentally lost our gRPC metrics. This change adds them back.

Specifically, it adds two convenience functions, one for clients and one for servers, that makes the necessary metrics object and registers it. We run these in the main function of each server.

I considered adding these as part of StatsAndLogging, but the corresponding ClientMetrics and ServerMetrics objects (defined by go-grpc-prometheus) need to be subsequently made available during construction of the gRPC clients and servers. We could add them as fields on Scope, but this seemed like a little too much tight coupling.

Also, update go-grpc-prometheus to get the necessary methods.

```
$ go test github.com/grpc-ecosystem/go-grpc-prometheus/...
ok      github.com/grpc-ecosystem/go-grpc-prometheus    0.069s
?       github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto [no test files]
```
2017-12-07 15:44:55 -08:00
Jacob Hoffman-Andrews d542960a35 Remove statsd version of RPC stats (#2693)
* Remove statsd-style RPC stats.

* Remove tests for old code.
2017-04-25 10:10:35 -04:00
Jacob Hoffman-Andrews 510e279208 Simplify gRPC TLS configs. (#2470)
Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.

This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.

This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
2017-01-06 14:19:18 -08:00
Jacob Hoffman-Andrews b8a237ffb3 Use grpc-go-prometheus for RPC stats. (#2391)
There's an off-the-shelf package that provides most of the stats we care about
for gRPC using interceptors. This change vendors go-grpc-prometheus and its
dependencies, and calls out to the interceptors provided by that package from
our own interceptors.

This will allow us to get metrics like latency histograms by call, status codes
by call, and so on.

Fixes #2390.

This change vendors go-grpc-prometheus and its dependencies. Per contributing guidelines, I've run the tests on these dependencies, and they pass:

go test github.com/davecgh/go-spew/spew github.com/grpc-ecosystem/go-grpc-prometheus github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto github.com/pmezard/go-difflib/difflib github.com/stretchr/testify/assert github.com/stretchr/testify/require github.com/stretchr/testify/suite 
ok      github.com/davecgh/go-spew/spew 0.022s
ok      github.com/grpc-ecosystem/go-grpc-prometheus    0.120s
?       github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto [no test files]
ok      github.com/pmezard/go-difflib/difflib   0.042s
ok      github.com/stretchr/testify/assert      0.021s
ok      github.com/stretchr/testify/require     0.017s
ok      github.com/stretchr/testify/suite       0.012s
2016-12-05 14:31:22 -08:00
Daniel McCarney 6c983e8c9e Implements client whitelisting for gRPC. (#2307)
As described in #2282, our gRPC code uses mutual TLS to authenticate both clients and servers. However, currently our gRPC servers will accept any client certificate signed by the internal CA we use to authenticate connections. Instead, we would like each server to have a list of which clients it will accept. This will improve security by preventing the compromise of one client private key being used to access endpoints unrelated to its intended scope/purpose.

This PR implements support for gRPC servers to specify a list of accepted client names. A `serverTransportCredentials` implementing `ServerHandshake` uses a `verifyClient` function to enforce that the connecting peer presents a client certificate with a SAN entry that matches an entry on the list of accepted client names

The `NewServer` function from `grpc/server.go` is updated to instantiate the `serverTransportCredentials` used by `grpc.NewServer`, specifying an accepted names list populated from the `cmd.GRPCServerConfig.ClientNames` config field.

The pre-existing client and server certificates in `test/grpc-creds/` are replaced by versions that contain SAN entries as well as subject common names. A DNS and an IP SAN entry are added to allow testing both methods of specifying allowed SANs. The `generate.sh` script is converted to use @jsha's `minica` tool (OpenSSL CLI is blech!).

An example client whitelist is added to each of the existing gRPC endpoints in config-next/ to allow the SAN of the test RPC client certificate.

Resolves #2282
2016-11-08 13:57:34 -05:00
Jacob Hoffman-Andrews 332b019b99 Split grpc/util.go into client and server. (#2212)
Having files or packages named util is not great, because they wind up
attracting lots of small, unrelated functionality.
2016-09-29 10:53:17 -07:00