Instead of using `unwrapError/wrapError` in each of the wrapper functions do it in the server/client interceptors instead. This means we now consistently do error unwrapping/wrapping.
Fixes#2509.
Previously we had `Error` and `ValidationRecords` fields in the `Challenge` protobuf but they were never populated which mean't that when using gRPC these fields wouldn't be sent to the SA from the RA on a `FinalizeAuthorization` call. This change populates those fields and updates the PB marshaling tests to verify the correct behavior.
Fixes#2514.
Fixes#976.
This implements a new rate limit, InvalidAuthorizationsPerAccount. If a given account fails authorization for a given hostname too many times within the window, subsequent new-authz attempts for that account and hostname will fail early with a rateLimited error. This mitigates the misconfigured clients that constantly retry authorization even though they always fail (e.g., because the hostname no longer resolves).
For the new rate limit, I added a new SA RPC, CountInvalidAuthorizations. I chose to implement this only in gRPC, not in AMQP-RPC, so checking the rate limit is gated on gRPC. See #2406 for some description of the how and why. I also chose to directly use the gRPC interfaces rather than wrapping them in core.StorageAuthority, as a step towards what we will want to do once we've moved fully to gRPC.
Because authorizations don't have a created time, we need to look at the expires time instead. Invalid authorizations retain the expiration they were given when they were created as pending authorizations, so we use now + pendingAuthorizationLifetime as one side of the window for rate limiting, and look backwards from there. Note that this means you could maliciously bypass this rate limit by stacking up pending authorizations over time, then failing them all at once.
Similarly, since this limit is by (account, hostname) rather than just (hostname), you can bypass it by creating multiple accounts. It would be more natural and robust to limit by hostname, like our certificate limits. However, we currently only have two indexes on the authz table: the primary key, and
(`registrationID`,`identifier`,`status`,`expires`)
Since this limit is intended mainly to combat misconfigured clients, I think this is sufficient for now.
Corresponding PR for website: letsencrypt/website#125
Currently services will pass both `core.XXXError` and `probs.XXX` type errors across the gRPC layer. In the future (#2505) we intend to stop passing `probs.XXX` type errors across this layer but for now we need to support them until that change is landed. This patch takes the easiest path to allow this by encoding the `probs.ProblemDetails` to JSON and storing it in the gRPC error body so that it can be passed around.
Fixes#2497.
We turn arrays into maps with a range command. Previously, we were taking the
address of the iteration variable in that range command, which meant incorrect
results since the iteration variable gets reassigned.
Also change the integration test to catch this error.
Fixes#2496
Previously our gRPC client code called the wrong function, enabling server-side instead of client-side histograms.
Also, add a timing stat for the generate / store combination in OCSP Updater.
We have a number of stats already expressed using the statsd interface. During
the switchover period to direct Prometheus collection, we'd like to make those
stats available both ways. This change automatically exports any stats exported
using the statsd interface via Prometheus as well.
This is a little tricky because Prometheus expects all stats to by registered
exactly once. Prometheus does offer a mechanism to gracefully recover from
registering a stat more than once by handling a certain error, but it is not
safe for concurrent access. So I added a concurrency-safe wrapper that creates
Prometheus stats on demand and memoizes them.
In the process, made a few small required side changes:
- Clean "/" from method names in the gRPC interceptors. They are allowed in
statsd but not in Prometheus.
- Replace "127.0.0.1" with "boulder" as the name of our testing CT log.
Prometheus stats can't start with a number.
- Remove ":" from the CT-log stat names emitted by Publisher. Prometheus stats
can't include it.
- Remove a stray "RA" in front of some rate limit stats, since it was
duplicative (we were emitting "RA.RA..." before).
Note that this means two stat groups in particular are duplicated:
- Gostats* is duplicated with the default process-level stats exported by the
Prometheus library.
- gRPCClient* are duplicated by the stats generated by the go-grpc-prometheus
package.
When writing dashboards and alerts in the Prometheus world, we should be careful
to avoid these two categories, as they will disappear eventually. As a general
rule, if a stat is available with an all-lowercase name, choose that one, as it
is probably the Prometheus-native version.
In the long run we will want to create most stats using the native Prometheus
stat interface, since it allows us to use add labels to metrics, which is very
useful. For instance, currently our DNS stats distinguish types of queries by
appending the type to the stat name. This would be more natural as a label in
Prometheus.
Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.
This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.
This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
This allows finer-grained control of which components can request issuance. The OCSP Updater should not be able to request issuance.
Also, update test/grpc-creds/generate.sh to reissue the certs properly.
Resolves#2417
This is a roll-forward of 5b865f1, with the QueueDeclare and QueueBind changes
in AMQP-RPC removed, and the startup order changes in test/startservers.py
removed. The AMQP-RPC changes caused RabbitMQ permission problems in production,
and the startup order changes depended on the AMQP-RPC changes but were not
required now that we have a unittest also.
This allows us to restart backends with relatively little interruption in
service, provided the backends come up promptly.
Fixes#2389 and #2408
There is now one file per service, containing both the client-side and
server-side wrappers for that service. This is a straight move of the code, with
the copyright, header comments, package statement, and imports copied into each
new file, and goimports run on the result.
Two custom errors were moved into bcodes.go.
Fixes#2388.
There's an off-the-shelf package that provides most of the stats we care about
for gRPC using interceptors. This change vendors go-grpc-prometheus and its
dependencies, and calls out to the interceptors provided by that package from
our own interceptors.
This will allow us to get metrics like latency histograms by call, status codes
by call, and so on.
Fixes#2390.
This change vendors go-grpc-prometheus and its dependencies. Per contributing guidelines, I've run the tests on these dependencies, and they pass:
go test github.com/davecgh/go-spew/spew github.com/grpc-ecosystem/go-grpc-prometheus github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto github.com/pmezard/go-difflib/difflib github.com/stretchr/testify/assert github.com/stretchr/testify/require github.com/stretchr/testify/suite
ok github.com/davecgh/go-spew/spew 0.022s
ok github.com/grpc-ecosystem/go-grpc-prometheus 0.120s
? github.com/grpc-ecosystem/go-grpc-prometheus/examples/testproto [no test files]
ok github.com/pmezard/go-difflib/difflib 0.042s
ok github.com/stretchr/testify/assert 0.021s
ok github.com/stretchr/testify/require 0.017s
ok github.com/stretchr/testify/suite 0.012s
This PR introduces the ability for the ocsp-updater to only resubmit certificates to logs that we are missing SCTs from. Prior to this commit when a certificate was missing one or more SCTs we would submit it to every log, causing unnecessary overhead for us and the log operator.
To accomplish this a new RPC endpoint is added to the Publisher service "SubmitToSingleCT". Unlike the existing "SubmitToCT" this RPC endpoint accepts a log URI and public key in addition to the certificate DER bytes. The certificate is submitted directly to that log, and a cache of constructed resources is maintained so that subsequent submissions to the same log can reuse the stat name, verifier, and submission client.
Resolves#1679
`grpc/creds:serverTransportCredentials.validateClient` is meant to ignore the check if the `acceptedSANs` map it is constructed with is `nil`. This never happens as the map is constructed using `make(map[string]struct{})` meaning it can never be `nil`.
Instead start with a `nil` map and only populate it if we have `ClientNames` to whitelist.
Fixes#2375.
Previously we had custom code in each gRPC wrapper to implement timeouts. Moving
the timeout code into the client interceptor allows us to simplify things and
reduce code duplication.
Adds a gRPC server to the SA and SA gRPC Clients to the WFE, RA, CA, Publisher, OCSP updater, orphan finder, admin revoker, and expiration mailer.
Also adds a CA gRPC client to the OCSP Updater which was missed in #2193.
Fixes#2347.
This PR updates our GRPC library dep. to v1.0.3. It's likely we can update to v1.0.4 without much effort but on a first attempt it seems that the SupportPackageIsVersion3 to SupportPackageIsVersion4 change might cause some headaches so I started with 1.0.3.
The grpc/creds.go serverTransportCredentials and clientTransportCredentials needed two new funcs (Clone and OverrideServerName) to conform to the updated credentials.TransportCredentials interface.
It's tempting to remove grpc/creds.go clientTransportCredentials entirely now that the TLSCredentials from upstream has a OverrideServerName function we can use, but unfortunately it only supports one hostname and must be called a-head of ClientHandshake that we still need clientTransportCredentials for our use-case. I tried this and failed, so clientTransportCredentials remains in bgrpc/creds/creds.go.
Per CONTRIBUTING.md I've verified the unit tests pass:
daniel@XXXXXX:~/go/src/google.golang.org/grpc$ git show -s
commit b7f1379d3cbbbeb2ca3405852012e237aa05459e
Merge: 33731fd bac9e1d
Author: Qi Zhao <toqizhao@gmail.com>
Date: Mon Oct 17 16:02:05 2016 -0700
Merge pull request #903 from improbable-io/blocking-graceful-shutdown-fix
Make concurrent Server.GracefulStop calls all behave equivalently.
daniel@XXXXXX:~/go/src/google.golang.org/grpc$ go test ./...
ok google.golang.org/grpc 0.215s
ok google.golang.org/grpc/benchmark 0.017s
? google.golang.org/grpc/benchmark/client [no test files]
? google.golang.org/grpc/benchmark/grpc_testing [no test files]
? google.golang.org/grpc/benchmark/server [no test files]
? google.golang.org/grpc/benchmark/stats [no test files]
? google.golang.org/grpc/benchmark/worker [no test files]
? google.golang.org/grpc/codes [no test files]
ok google.golang.org/grpc/credentials 0.041s
? google.golang.org/grpc/credentials/oauth [no test files]
? google.golang.org/grpc/examples/helloworld/greeter_client [no test files]
? google.golang.org/grpc/examples/helloworld/greeter_server [no test files]
? google.golang.org/grpc/examples/helloworld/helloworld [no test files]
? google.golang.org/grpc/examples/route_guide/client [no test files]
? google.golang.org/grpc/examples/route_guide/routeguide [no test files]
? google.golang.org/grpc/examples/route_guide/server [no test files]
ok google.golang.org/grpc/grpclb 0.047s
? google.golang.org/grpc/grpclb/grpc_lb_v1 [no test files]
? google.golang.org/grpc/grpclog [no test files]
? google.golang.org/grpc/grpclog/glogger [no test files]
? google.golang.org/grpc/health [no test files]
? google.golang.org/grpc/health/grpc_health_v1 [no test files]
? google.golang.org/grpc/internal [no test files]
? google.golang.org/grpc/interop [no test files]
? google.golang.org/grpc/interop/client [no test files]
? google.golang.org/grpc/interop/grpc_testing [no test files]
? google.golang.org/grpc/interop/server [no test files]
ok google.golang.org/grpc/metadata 0.004s
? google.golang.org/grpc/naming [no test files]
? google.golang.org/grpc/peer [no test files]
ok google.golang.org/grpc/reflection 0.029s
? google.golang.org/grpc/reflection/grpc_reflection_v1alpha [no test files]
? google.golang.org/grpc/reflection/grpc_testing [no test files]
? google.golang.org/grpc/stress/client [no test files]
? google.golang.org/grpc/stress/grpc_testing [no test files]
? google.golang.org/grpc/stress/metrics_client [no test files]
ok google.golang.org/grpc/test 94.693s
? google.golang.org/grpc/test/codec_perf [no test files]
? google.golang.org/grpc/test/grpc_testing [no test files]
ok google.golang.org/grpc/transport 12.574s
As described in #2282, our gRPC code uses mutual TLS to authenticate both clients and servers. However, currently our gRPC servers will accept any client certificate signed by the internal CA we use to authenticate connections. Instead, we would like each server to have a list of which clients it will accept. This will improve security by preventing the compromise of one client private key being used to access endpoints unrelated to its intended scope/purpose.
This PR implements support for gRPC servers to specify a list of accepted client names. A `serverTransportCredentials` implementing `ServerHandshake` uses a `verifyClient` function to enforce that the connecting peer presents a client certificate with a SAN entry that matches an entry on the list of accepted client names
The `NewServer` function from `grpc/server.go` is updated to instantiate the `serverTransportCredentials` used by `grpc.NewServer`, specifying an accepted names list populated from the `cmd.GRPCServerConfig.ClientNames` config field.
The pre-existing client and server certificates in `test/grpc-creds/` are replaced by versions that contain SAN entries as well as subject common names. A DNS and an IP SAN entry are added to allow testing both methods of specifying allowed SANs. The `generate.sh` script is converted to use @jsha's `minica` tool (OpenSSL CLI is blech!).
An example client whitelist is added to each of the existing gRPC endpoints in config-next/ to allow the SAN of the test RPC client certificate.
Resolves#2282
This commit updates the `go-jose` dependency to [v1.1.0](https://github.com/square/go-jose/releases/tag/v1.1.0) (Commit: aa2e30fdd1fe9dd3394119af66451ae790d50e0d). Since the import path changed from `github.com/square/...` to `gopkg.in/square/go-jose.v1/` this means removing the old dep and adding the new one.
The upstream go-jose library added a `[]*x509.Certificate` member to the `JsonWebKey` struct that prevents us from using a direct equality test against two `JsonWebKey` instances. Instead we now must compare the inner `Key` members.
The `TestRegistrationContactUpdate` function from `ra_test.go` was updated to populate the `Key` members used in testing instead of only using KeyID's to allow the updated comparisons to work as intended.
The `Key` field of the `Registration` object was switched from `jose.JsonWebKey` to `*jose.JsonWebKey ` to make it easier to represent a registration w/o a Key versus using a value with a nil `JsonWebKey.Key`.
I verified the upstream unit tests pass per contributing.md:
```
daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ git show
commit aa2e30fdd1fe9dd3394119af66451ae790d50e0d
Merge: 139276c e18a743
Author: Cedric Staub <cs@squareup.com>
Date: Thu Sep 22 17:08:11 2016 -0700
Merge branch 'master' into v1
* master:
Better docs explaining embedded JWKs
Reject invalid embedded public keys
Improve multi-recipient/multi-sig handling
daniel@XXXXX:~/go/src/gopkg.in/square/go-jose.v1$ go test ./...
ok gopkg.in/square/go-jose.v1 17.599s
ok gopkg.in/square/go-jose.v1/cipher 0.007s
? gopkg.in/square/go-jose.v1/jose-util [no test files]
ok gopkg.in/square/go-jose.v1/json 1.238s
```
Based on experience with the new gRPC staging deployment. gRPC generates `FullMethod` names such as `-ServiceName-MethodName` which can be confusing. For client calls to a service we actually want something formatted like `ServiceName-MethodName` and for server requests we want just `MethodName`.
This PR adds a method to clean up the `FullMethod` names returned by gRPC and formats them the way we expect.
Fixes#1880.
Updates google.golang.org/grpc and github.com/jmhodges/clock, both test suites pass. A few of the gRPC interfaces changed so this also fixes those breakages.
While testing with real proxies I noticed the original CDR implementation was actually pretty broken, this refactors a bit and fixes a number of bugs. With this patch fallback to GPDNS over three distributed test proxies worked perfectly.
(Side note: `nginx` is not a viable forward proxy for this use as it doesn't support SSL, and a bunch of other _real_ forward proxy features, I ended up just using `squid3`.)
The main error in the previous implementation was the fallback was implemented in `getCAASet` which is only called in the old code path (the local CAA impl instead of the remote service) which mean't it wasn't actually being tested in the integration test. This also refactors a few repeated blocks into their own functions. Also there was a unicode encoding problem somewhere with the query string but for the life of me I can't figure out why it was broken now.
Adds a server side unary RPC interceptor which includes basic stats. We could also use this to add a server request ID to the context.Context to identify the call through the system, but really I'd rather do that on the client side before the RPC is sent which requires the client interceptor implementation upstream. Also updates google.golang.org/grpc.
Updates #1880.
- Run both gRPC and AMQP servers simultaneously
- Take explicit constructor parameters and unexport fields that were previously set by users
- Remove transitional DomainCheck code in RA now that GSB is enabled.
- Remove some leftover UpdateValidation dummy methods.
* Split out CAA checking service (minus logging etc)
* Add example.yml config + follow general Boulder style
* Update protobuf package to correct version
* Add grpc client to va
* Add TLS authentication in both directions for CAA client/server
* Remove go lint check
* Add bcodes package listing custom codes for Boulder
* Add very basic (pull-only) gRPC metrics to VA + caa-service