Commit Graph

75 Commits

Author SHA1 Message Date
Jacob Hoffman-Andrews 11434650b7 Check safe browsing at validation time (#3539)
Right now we check safe browsing at new-authz time, which introduces a possible
external dependency when calling new-authz. This is usually fine, since most safe
browsing checks can be satisfied locally, but when requests have to go external,
it can create variance in new-authz timing.

Fixes #3491.
2018-03-09 11:15:05 +00:00
Jacob Hoffman-Andrews c556a1a20d
Reduce spurious errors in integration test (#3436)
Boulder is fairly noisy about gRPC connection errors. This is a mixed
blessing: Our gRPC configuration will try to reconnect until it hits
an RPC deadline, and most likely eventually succeed. In that case,
we don't consider those to really be errors. However, in cases where
a connection is repeatedly failing, we'd like to see errors in the
logs about connection failure, rather than "deadline exceeded." So
we want to keep logging of gRPC errors.

However, right now we get a lot of these errors logged during
integration tests. They make the output hard to read, and may disguise
more serious errors. So we'd like to avoid causing such errors in
normal integration test operation.

This change reorders the startup of Boulder components by their gRPC
dependencies, so everything's backend is likely to be up and running
before it starts. It also reverses that order for clean shutdowns,
and waits for each process to exit before signalling the next one.

With these changes, I still got connection errors. Taking listenbuddy
out of the gRPC path fixed them. I believe the issue is that
listenbuddy is not a truly transparent proxy. In particular, it
accepts an inbound TCP connection before opening an outbound TCP
connection. If opening that outbound connection results in "connection
refused," it closes the inbound connection. That means gRPC sees a
"connection closed" (or "connection reset"?) rather than "connection
refused". I'm guessing it handles those cases differently, explaining
the different error results.

We've been using listenbuddy to trigger disconnects while Boulder is
running, to ensure that gRPC's reconnect code works. I think we can
probably rely on gRPC's reconnect to work. The initial problem that
led us to start testing this was a configuration problem; now that
we have the configuration we want, we should be fine and don't need
to keep testing reconnects on every integration test run.
2018-02-12 18:17:50 -08:00
Jacob Hoffman-Andrews f16c3af335 Active UDPDNS by default. (#3285)
Active UDPDNS by default
2017-12-15 12:26:45 -08:00
Roland Bracewell Shoemaker bdea281ae0 Remove CAA SERVFAIL exceptions code (#3262)
Fixes #3080.
2017-12-05 14:39:37 -08:00
Jacob Hoffman-Andrews 2fd2f9e230 Remove LegacyCAA implementation. (#3240)
Fixes #3236
2017-11-20 16:09:00 -05:00
Jacob Hoffman-Andrews 4e68fb2ff6 Switch to udp for internal DNS. (#3135)
We used to use TCP because we would request DNSSEC records from Unbound, and
they would always cause truncated records when present. Now that we no longer
request those (#2718), we can use UDP. This is better because the TCP serving
paths in Unbound are likely less thoroughly tested, and not optimized for high
load. In particular this may resolve some availability problems we've seen
recently when trying to upgrade to a more recent Unbound.

Note that this only affects the Boulder->Unbound path. The Unbound->upstream
path is already UDP by default (with TCP fallback for truncated ANSWERs).
2017-10-10 10:06:33 -04:00
Daniel McCarney 1794c56eb8 Revert "Add CAA parameter to restrict challenge type (#3003)" (#3145)
This reverts commit 23e2c4a836.
2017-10-04 12:00:44 -07:00
lukaslihotzki 23e2c4a836 Add CAA parameter to restrict challenge type (#3003)
This commit adds CAA `issue` paramter parsing and the `challenge` parameter to permit a single challenge type only. By setting `challenge=dns-01`, the nameserver keeps control over every issued certificate.
2017-10-02 11:59:47 -07:00
Jacob Hoffman-Andrews 4266853092 Implement legacy form of CAA (#3075)
This implements the pre-erratum 5065 version of CAA, behind a feature flag.

This involved refactoring DNSClient.LookupCAA to return a list of CNAMEs in addition to the CAA records, and adding an alternate lookuper that does tree-climbing on single-depth aliases.
2017-09-13 10:16:12 -04:00
Daniel McCarney 57252c3b07 Remove letsencrypt/go-safe-browsing-api dependency. (#2905)
We have migrated from our fork of `go-safe-browsing-api` to Google's
safebrowsing v4 library. This commit removes the legacy safe browsing
code.
2017-07-26 13:57:57 -07:00
Roland Bracewell Shoemaker 088b872287 Implement multi VA validation (#2802)
Adds basic multi-path validation functionality. A new method `performRemoteValidation` is added to `boulder-va` which is called if it is configured with a list of remote VA gRPC addresses. In this initial implementation the remote VAs are only used to check the validation result of the main VA, if all of the remote validations succeed but the local validation failed, the overall validation will still fail. Remote VAs use the exact same code as the local VA to perform validation. If the local validation succeeds then a configured quorum of the remote VA successes must be met in order to fully complete the validation.

This implementation assumes that metrics are collected from the remote VAs in order to have visibility into their individual validation latencies etc.

Fixes #2621.
2017-06-29 14:11:01 -07:00
Jacob Hoffman-Andrews b17b5c72a6 Remove statsd from Boulder (#2752)
This removes the config and code to output to statsd.

- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.

Fixes #2639, #2733.
2017-05-15 10:19:54 -04:00
Daniel McCarney 47452d6c6c Prefer IPv6 addresses, fall back to IPv4. (#2715)
This PR introduces a new feature flag "IPv6First". 

When the "IPv6First" feature is enabled the VA's HTTP dialer and TLS SNI
(01 and 02) certificate fetch requests will attempt to automatically
retry when the initial connection was to IPv6 and there is an IPv4
address available to retry with.

This resolves https://github.com/letsencrypt/boulder/issues/2623
2017-05-08 13:00:16 -07:00
Jacob Hoffman-Andrews 6719dc17a6 Remove AMQP config and code (#2634)
We now use gRPC everywhere.
2017-04-03 10:39:39 -04:00
Daniel McCarney 15e73edc5a Google Safe Browsing V4 Improvements (#2504)
This PR has three primary contributions:

1. The existing code for using the V4 safe browsing API introduced in #2446 had some bugs that are fixed in this PR.
2. A gsb-test-srv is added to provide a mock Google Safebrowsing V4 server for integration testing purposes.
3. A short integration test is added to test end-to-end GSB lookup for an "unsafe" domain.

For 1) most notably Boulder was assuming the new V4 library accepted a directory for its database persistence when it instead expects an existing file to be provided. Additionally the VA wasn't properly instantiating feature flags preventing the V4 api from being used by the VA.

For 2) the test server is designed to have a fixed set of "bad" domains (Currently just honest.achmeds.discount.hosting.com). When asked for a database update by a client it will package the list of bad domains up & send them to the client. When the client is asked to do a URL lookup it will check the local database for a matching prefix, and if found, perform a lookup against the test server. The test server will process the lookup and increment a count for how many times the bad domain was asked about.

For 3) the Boulder startservers.py was updated to start the gsb-test-srv and the VA is configured to talk to it using the V4 API. The integration test consists of attempting issuance for a domain pre-configured in the gsb-test-srv as a bad domain. If the issuance succeeds we know the GSB lookup code is faulty. If the issuance fails, we check that the gsb-test-srv received the correct number of lookups for the "bad" domain and fail if the expected isn't reality.

Notes for reviewers:

* The gsb-test-srv has to be started before anything will use it. Right now the v4 library handles database update request failures poorly and will not retry for 30min. See google/safebrowsing#44 for more information.
* There's not an easy way to test for "good" domain lookups, only hits against the list. The design of the V4 API is such that a list of prefixes is delivered to the client in the db update phase and if the domain in question matches no prefixes then the lookup is deemed unneccesary and not performed. I experimented with sending 256 1 byte prefixes to try and trick the client to always do a lookup, but the min prefix size is 4 bytes and enumerating all possible prefixes seemed gross.
* The test server has a /add endpoint that could be used by integration tests to add new domains to the block list, but it isn't being used presently. The trouble is that the client only updates its database every 30 minutes at present, and so adding a new domain will only take affect after the client updates the database.

Resolves #2448
2017-01-23 11:07:20 -08:00
Jacob Hoffman-Andrews 510e279208 Simplify gRPC TLS configs. (#2470)
Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.

This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.

This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
2017-01-06 14:19:18 -08:00
Jacob Hoffman-Andrews 089a270453 Add instructions on load testing OCSP generation. (#2459) 2017-01-02 11:36:03 -08:00
Jacob Hoffman-Andrews 0c665b2053 Split up gRPC certificates by service. (#2453)
Previously, all gRPC services used the same client and server certificates. Now,
each service has its own certificate, which it uses for both client and server
authentication, more closely simulating production.

This also adds aliases for each of the relevant hostnames in /etc/hosts. There
may be some issues if Docker decides to rewrite /etc/hosts while Boulder is
running, but this seems to work for now.
2016-12-29 14:53:59 -08:00
Daniel McCarney 74e281c1ce Switch to Google's v4 safebrowsing library. (#2446)
Right now we are using a third-party client for the Google Safe Browsing API, but Google has recently released their own [Golang library](https://github.com/google/safebrowsing) which also supports the newer v4 API. Using this library will let us avoid fixing some lingering race conditions & unpleasantness in our fork of `go-safebrowsing-api`.

This PR adds support for using the Google library & the v4 API in place of our existing fork when the `GoogleSafeBrowsingV4` feature flag is enabled in the VA "features" configuration.

Resolves https://github.com/letsencrypt/boulder/issues/1863

Per `CONTRIBUTING.md` I also ran the unit tests for the new dependency:
```
daniel@XXXXXXXXXX:~/go/src/github.com/google/safebrowsing$ go test ./...
ok  	github.com/google/safebrowsing	3.274s
?   	github.com/google/safebrowsing/cmd/sblookup	[no test files]
?   	github.com/google/safebrowsing/cmd/sbserver	[no test files]
?   	github.com/google/safebrowsing/cmd/sbserver/statik	[no test files]
?   	github.com/google/safebrowsing/internal/safebrowsing_proto	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/jsonpb	0.012s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/jsonpb/jsonpb_test_proto	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/proto	0.062s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/proto/proto3_proto	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/descriptor	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/generator	0.017s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/grpc	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/protoc-gen-go/plugin	[no test files]
ok  	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes	0.009s
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/any	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/duration	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/empty	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/struct	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/timestamp	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/golang/protobuf/ptypes/wrappers	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/rakyll/statik	[no test files]
?   	github.com/google/safebrowsing/vendor/github.com/rakyll/statik/fs	[no test files]
ok  	github.com/google/safebrowsing/vendor/golang.org/x/net/idna	0.003s
```
2016-12-27 11:18:11 -05:00
Jacob Hoffman-Andrews 1c1449b284 Improvements to tests and test configs. (#2396)
- Remove spinner from test.js. It made Travis logs hard to read.
- Listen on all interfaces for debugAddr. This makes it possible to check
  Prometheus metrics for instances running in a Docker container.
- Standardize DNS timeouts on 1s and 3 retries across all configs. This ensures
  DNS completes within the relevant RPC timeouts.
- Remove RA service queue from VA, since VA no longer uses the callback to RA on
  completing a challenge.
2016-12-05 14:35:27 -08:00
Roland Bracewell Shoemaker e2155388a1 Remove caa-checker from the tree (#2351)
The VA can internally check CAA and this additional code was deemed unneeded complexity that could be hoisted outside of Boulder. Fixes #2346.
2016-11-23 08:42:33 -05:00
Daniel McCarney 6c983e8c9e Implements client whitelisting for gRPC. (#2307)
As described in #2282, our gRPC code uses mutual TLS to authenticate both clients and servers. However, currently our gRPC servers will accept any client certificate signed by the internal CA we use to authenticate connections. Instead, we would like each server to have a list of which clients it will accept. This will improve security by preventing the compromise of one client private key being used to access endpoints unrelated to its intended scope/purpose.

This PR implements support for gRPC servers to specify a list of accepted client names. A `serverTransportCredentials` implementing `ServerHandshake` uses a `verifyClient` function to enforce that the connecting peer presents a client certificate with a SAN entry that matches an entry on the list of accepted client names

The `NewServer` function from `grpc/server.go` is updated to instantiate the `serverTransportCredentials` used by `grpc.NewServer`, specifying an accepted names list populated from the `cmd.GRPCServerConfig.ClientNames` config field.

The pre-existing client and server certificates in `test/grpc-creds/` are replaced by versions that contain SAN entries as well as subject common names. A DNS and an IP SAN entry are added to allow testing both methods of specifying allowed SANs. The `generate.sh` script is converted to use @jsha's `minica` tool (OpenSSL CLI is blech!).

An example client whitelist is added to each of the existing gRPC endpoints in config-next/ to allow the SAN of the test RPC client certificate.

Resolves #2282
2016-11-08 13:57:34 -05:00
Daniel McCarney 409f1623e6 Retires `LookupIPv6` VA flag. (#2205)
The LookupIPv6 flag has been enabled in production and isn't required anymore. This PR removes the flag entirely.

The errA and errAAAA error handling in LookupHost is left as-is, meaning that a non-nil errAAAA will not be returned to the caller. This matches the existing behaviour, and the expectations of the TestDNSLookupHost unit tests.

This commit also removes the tests from TestDNSLookupHost that tested the LookupIPv6 == false behaviours since those are no longer implemented.

Resolves #2191
2016-09-26 18:00:01 -07:00
Roland Bracewell Shoemaker 239bf9ae0a Very basic feature flag impl (#1705)
Updates #1699.

Adds a new package, `features`, which exposes methods to set and check if various internal features are enabled. The implementation uses global state to store the features so that services embedded in another service do not each require their own features map in order to check if something is enabled.

Requires a `boulder-tools` image update to include `golang.org/x/tools/cmd/stringer`.
2016-09-20 16:29:01 -07:00
Ben Irving bea8e57536 Split up boulder-config.json (VA) (#1979) 2016-07-01 13:06:50 -04:00