Commit Graph

36 Commits

Author SHA1 Message Date
Jacob Hoffman-Andrews 4670be1210 Reduce log level for WFE in tests. (#3918)
Our Travis output is quite verbose with the WFE output, and it's very
rare that we have to reference it. I'd like to remove the INFO-level
logs (i.e. the logs of every request) so that it's easier to see real
errors, and faster to scroll to the bottom of logs of failed runs.
2018-11-01 09:50:41 -04:00
Roland Bracewell Shoemaker 876c727b6f Update gRPC (#3817)
Fixes #3474.
2018-08-20 10:55:42 -04:00
Daniel McCarney bbf0102cdc
Remove UseAIAIssuerURL feature flag and code. (#3790)
We aren't going to deploy this as-is and its causing integration test
problems for downstream clients.
2018-07-03 16:29:44 -04:00
Jacob Hoffman-Andrews dbcb16543e Start using multiple-IP hostnames for load balancing (#3687)
We'd like to start using the DNS load balancer in the latest version of gRPC. That means putting all IPs for a service under a single hostname (or using a SRV record, but we're not taking that path). This change adds an sd-test-srv to act as our service discovery DNS service. It returns both Boulder IP addresses for any A lookup ending in ".boulder". This change also sets up the Docker DNS for our boulder container to defer to sd-test-srv when it doesn't know an answer.

sd-test-srv doesn't know how to resolve public Internet names like `github.com`. Resolving public names is required for the `godep-restore` test phase, so this change breaks out a copy of the boulder container that is used only for `godep-restore`.

This change implements a shim of a DNS resolver for gRPC, so that we can switch to DNS-based load balancing with the currently vendored gRPC, then when we upgrade to the latest gRPC we won't need a simultaneous config update.

Also, this change introduces a check at the end of the integration test that each backend received at least one RPC, ensuring that we are not sending all load to a single backend.
2018-05-23 09:47:14 -04:00
Jacob Hoffman-Andrews a4421ae75b Run gRPC backends on multiple IPs instead of multiple ports (#3679)
We're currently stuck on gRPC v1.1 because of a breaking change to certificate validation in gRPC 1.8. Our gRPC balancer uses a static list of multiple hostnames, and expects to validate against those hostnames. However gRPC expects that a service is one hostname, with multiple IP addresses, and validates all those IP addresses against the same hostname. See grpc/grpc-go#2012.

If we follow gRPC's assumptions, we can rip out our custom Balancer and custom TransportCredentials, and will probably have a lower-friction time in general.

This PR is the first step in doing so. In order to satisfy the "multiple IPs, one port" property of gRPC backends in our Docker container infrastructure, we switch to Docker's user-defined networking. This allows us to give the Boulder container multiple IP addresses on different local networks, and gives it different DNS aliases in each network.

In startservers.py, each shard of a service listens on a different DNS alias for that service, and therefore a different IP address. The listening port for each shard of a service is now identical.

This change also updates the gRPC service certificates. Now, each certificate that is used in a gRPC service (as opposed to something that is "only" a client) has three names. For instance, sa1.boulder, sa2.boulder, and sa.boulder (the generic service name). For now, we are validating against the specific hostnames. When we update our gRPC dependency, we will begin validating against the generic service name.

Incidentally, the DNS aliases feature of Docker allows us to get rid of some hackery in entrypoint.sh that inserted entries into /etc/hosts.

Note: Boulder now has a dependency on the DNS aliases feature in Docker. By default, docker-compose run creates a temporary container and doesn't assign any aliases to it. We now need to specify docker-compose run --use-aliases to get the correct behavior. Without --use-aliases, Boulder won't be able to resolve the hostnames it wants to bind to.
2018-05-07 10:38:31 -07:00
Daniel McCarney 054f181458 load-generator: send correct ACMEv2 Content-Type on POST (#3667)
load generator: send correct ACMEv2 Content-Type on POST.

This PR updates the Boulder load-generator to send the correct ACMEv2 Content-Type header when POSTing the ACME server. This is required for ACMEv2 and without it all POST requests to the WFE2 running a test/config-next configuration result in malformed 400 errors. While only required by ACMEv2 this commit sends it for ACMEv1 requests as well. No harm no foul.

integration tests: allow running just the load generator.
Prior to this PR an omission in an if statement in integration-test.py meant that you couldn't invoke test/integration-test.py with just the --load argument to only run the load generator. This commit updates the if to allow this use case.
2018-05-01 12:22:43 -07:00
Roland Bracewell Shoemaker 0a86573a73 Update integration tests 2018-04-20 13:18:40 -07:00
Jacob Hoffman-Andrews a4f9de9e35 Improve nesting of RPC deadlines (#3619)
gRPC passes deadline information through the RPC boundary, but client and server have the same deadline. Ideally we'd like the server to have a slightly tighter deadline than the client, so if one of the server's onward RPCs or other network calls times out, the server can pass back more detailed information to the client, rather than the client timing out the server and losing the opportunity to log more detailed information about which component caused the timeout.

In this change, I subtract 100ms from the deadline on the server side of our interceptors, using our existing serverInterceptor. I also check that there is at least 100ms remaining in which to do useful work, so the server doesn't begin a potentially expensive task only to abort it.

Fixes #3608.
2018-04-06 15:40:18 +01:00
Daniel McCarney 17922a6d2d
Add CAAIdentities and Website to /directory "meta". (#3588)
This commit updates the WFE and WFE2 to have configuration support for
setting a value for the `/directory` object's "meta" field's
optional "caaIdentities" and "website" fields. The config-next wfe/wfe2
configuration are updated with values for these fields. Unit tests are
updated to check that they are sent when expected and not otherwise.

Bonus content: The `test.AssertUnmarshaledEquals` function had a bug
where it would consider two inputs equal when the # of keys differed.
This commit also fixes that bug.
2018-03-22 16:12:43 -04:00
Jacob Hoffman-Andrews c556a1a20d
Reduce spurious errors in integration test (#3436)
Boulder is fairly noisy about gRPC connection errors. This is a mixed
blessing: Our gRPC configuration will try to reconnect until it hits
an RPC deadline, and most likely eventually succeed. In that case,
we don't consider those to really be errors. However, in cases where
a connection is repeatedly failing, we'd like to see errors in the
logs about connection failure, rather than "deadline exceeded." So
we want to keep logging of gRPC errors.

However, right now we get a lot of these errors logged during
integration tests. They make the output hard to read, and may disguise
more serious errors. So we'd like to avoid causing such errors in
normal integration test operation.

This change reorders the startup of Boulder components by their gRPC
dependencies, so everything's backend is likely to be up and running
before it starts. It also reverses that order for clean shutdowns,
and waits for each process to exit before signalling the next one.

With these changes, I still got connection errors. Taking listenbuddy
out of the gRPC path fixed them. I believe the issue is that
listenbuddy is not a truly transparent proxy. In particular, it
accepts an inbound TCP connection before opening an outbound TCP
connection. If opening that outbound connection results in "connection
refused," it closes the inbound connection. That means gRPC sees a
"connection closed" (or "connection reset"?) rather than "connection
refused". I'm guessing it handles those cases differently, explaining
the different error results.

We've been using listenbuddy to trigger disconnects while Boulder is
running, to ensure that gRPC's reconnect code works. I think we can
probably rely on gRPC's reconnect to work. The initial problem that
led us to start testing this was a configuration problem; now that
we have the configuration we want, we should be fine and don't need
to keep testing reconnects on every integration test run.
2018-02-12 18:17:50 -08:00
Roland Bracewell Shoemaker fc5c8f76b6 Remove unused features (#3393)
This removes a number of unused features (i.e. they are never checked anywhere).
2018-01-25 08:55:05 -05:00
Jacob Hoffman-Andrews 827f7859f2 Fix issuerCert in test configs. (#3310)
Previously, there was a disagreement between WFE and CA as to what the correct
issuer certificate was. Consolidate on test-ca2.pem (h2ppy h2cker fake CA).
    
Also, the CA configs contained an outdated entry for "IssuerCert", which was not
being used: The CA configs now use an "Issuers" array to allow signing by
multiple issuer certificates at once (for instance when rolling intermediates).
Removed this outdated entry, and the config code for CA to load it. I've
confirmed these changes match what is currently in production.

Added an integration test to check for this problem in the future.

Fixes #3309, thanks to @icing for bringing the issue to our attention!

This also includes changes from #3321 to clarify certificates for WFE.
2018-01-09 07:56:39 -05:00
Jacob Hoffman-Andrews cd49316493 Do ROCAChecks by default. (#3283)
This feature flag has been enabled in prod, and we don't expect to want to
turn it off any time soon.
2017-12-15 13:44:39 -08:00
Daniel McCarney 55dd1020c0 Increase VA SingleDialTimeout to 10s. (#3260)
This PR changes the VA's singleDialTimeout value from 5 * time.Second to 10 * time.Second. This will give slower servers a better chance to respond, especially for the multi-VA case where n requests arrive ~simultaneously.

This PR also bumps the RA->VA timeout by 5s and the WFE->RA timeout by 5s to accommodate the increased dial timeout. I put this in a separate commit in case we'd rather deal with this separately.
2017-12-04 09:53:26 -08:00
Jacob Hoffman-Andrews 5df083a57e Add ROCA weak key checking (#3189)
Thanks to @titanous for the library!
2017-11-02 08:42:59 -04:00
Jacob Hoffman-Andrews 071fc0120f Remove facebookgo/httpdown. (#3168)
Its purpose is now served by net/http's Shutdown().
2017-10-17 08:55:43 -04:00
Jacob Hoffman-Andrews 8afec60433 Remove unneeded dns config value for WFE. (#3057) 2017-09-08 14:32:36 -04:00
Jacob Hoffman-Andrews b17b5c72a6 Remove statsd from Boulder (#2752)
This removes the config and code to output to statsd.

- Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`.
- Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be.
- Delete vendored statsd client.
- Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere.
- Remove a few unused methods on `metrics.Scope`, and update its generated mock.
- Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given.
- Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly.

Fixes #2639, #2733.
2017-05-15 10:19:54 -04:00
Daniel 8c547473b8
Adds "meta" entry w/ ToS to /directory.
This commit adds the acme draft-02+ optional "meta" element for the
/directory response. Presently we only include the optional
"terms-of-service" URL. Whether the meta entry is included is controlled
by two factors:

  1. The state of the "DirectoryMeta" feature flag, which defaults to
     off
  2. Whether the client advertises the UA we know to be intolerant of
     new directory entries.

The TestDirectory unit test is updated to test both states of the flag
and the UA detection.
2017-05-09 15:43:16 -04:00
Roland Bracewell Shoemaker 730318a755 Add GREASE to directory (#2731)
Randomly generates and adds a key to the directory object with the value grease.

Fixes #2415.
2017-05-08 14:13:35 -07:00
Roland Bracewell Shoemaker a46d30945c Purge remaining AMQP code (#2648)
Deletes github.com/streadway/amqp and the various RabbitMQ setup tools etc. Changes how listenbuddy is used to proxy all of the gRPC client -> server connections so we test reconnection logic.

+49 -8,221 😁

Fixes #2640 and #2562.
2017-04-04 15:02:22 -07:00
Patrick Figel 6ba8aadfd7 Use X.509 AIA Issuer URL in rel="up" link header (#2545)
In order to provide the correct issuer certificate for older certificates after an issuer certificate rollover or when using multiple issuer certificates (e.g. RSA and ECDSA), use the AIA CA Issuer URL embedded in the certificate for the rel="up" link served by WFE. This behaviour is gated behind the UseAIAIssuerURL feature, which defaults to false.

To prevent MitM vulnerabilities in cases where the AIA URL is HTTP-only, it is upgraded to HTTPS.

This also adds a test for the issuer URL returned by the /acme/cert endpoint. wfe/test/178.{crt,key} were regenerated to add the AIA extension required to pass the test.

/acme/cert was changed to return an absolute URL to the issuer endpoint (making it consistent with /acme/new-cert).

Fixes #1663
Based on #1780
2017-02-07 11:19:22 -08:00
Jacob Hoffman-Andrews 510e279208 Simplify gRPC TLS configs. (#2470)
Previously, a given binary would have three TLS config fields (CA cert, cert,
key) for its gRPC server, plus each of its configured gRPC clients. In typical
use, we expect all three of those to be the same across both servers and clients
within a given binary.

This change reuses the TLSConfig type already defined for use with AMQP, adds a
Load() convenience function that turns it into a *tls.Config, and configures it
for use with all of the binaries. This should make configuration easier and more
robust, since it more closely matches usage.

This change preserves temporary backwards-compatibility for the
ocsp-updater->publisher RPCs, since those are the only instances of gRPC
currently enabled in production.
2017-01-06 14:19:18 -08:00
Jacob Hoffman-Andrews 0c665b2053 Split up gRPC certificates by service. (#2453)
Previously, all gRPC services used the same client and server certificates. Now,
each service has its own certificate, which it uses for both client and server
authentication, more closely simulating production.

This also adds aliases for each of the relevant hostnames in /etc/hosts. There
may be some issues if Docker decides to rewrite /etc/hosts while Boulder is
running, but this seems to work for now.
2016-12-29 14:53:59 -08:00
Jacob Hoffman-Andrews 1c1449b284 Improvements to tests and test configs. (#2396)
- Remove spinner from test.js. It made Travis logs hard to read.
- Listen on all interfaces for debugAddr. This makes it possible to check
  Prometheus metrics for instances running in a Docker container.
- Standardize DNS timeouts on 1s and 3 retries across all configs. This ensures
  DNS completes within the relevant RPC timeouts.
- Remove RA service queue from VA, since VA no longer uses the callback to RA on
  completing a challenge.
2016-12-05 14:35:27 -08:00
Roland Bracewell Shoemaker 03fdd65bfe Add gRPC server to SA (#2374)
Adds a gRPC server to the SA and SA gRPC Clients to the WFE, RA, CA, Publisher, OCSP updater, orphan finder, admin revoker, and expiration mailer.

Also adds a CA gRPC client to the OCSP Updater which was missed in #2193.

Fixes #2347.
2016-12-02 17:24:46 -08:00
Roland Bracewell Shoemaker a87379bc6e Add gRPC server to RA (#2350)
Fixes #2348.
2016-11-29 15:34:35 -08:00
Jacob Hoffman-Andrews 1df986b858 Remove CheckMalformedCSR feature flag. (#2370)
This is now enabled in prod and can default to enabled.
2016-11-29 17:05:05 -05:00
Roland Bracewell Shoemaker ce679bad41 Implement key rollover (#2231)
Fixes #503.

Functionality is gated by the feature flag `AllowKeyRollover`. Since this functionality is only specified in ACME draft-03 and we mostly implement the draft-02 style this takes some liberties in the implementation, which are described in the updated divergences doc. The `key-change` resource is used to side-step draft-03 `url` requirement.
2016-10-27 10:22:09 -04:00
Roland Bracewell Shoemaker 9648e1cf85 Fix config-next features location and registration status validity check (#2225)
Move features sections to the correct JSON object and only test registration validity if regCheck is true

* Pull other flag up to correct level

* Only check status update when status is non-empty
2016-10-05 12:31:59 -04:00
Roland Bracewell Shoemaker c6e3ef660c Re-apply 2138 with proper gating (#2199)
Re-applies #2138 using the new style of feature-flag gated migrations. Account deactivation is gated behind `features.AllowAccountDeactivation`.
2016-09-29 17:16:03 -04:00
Roland Bracewell Shoemaker 51ee04e6a9 Allow authorization deactivation (#2116)
Implements `valid` and `pending` authz deactivation.
2016-08-23 16:25:06 -04:00
Roland Bracewell Shoemaker fc39781274 Allow user specified revocation reason (#2089)
Fixes #140.

This patch allows users to specify the following revocation reasons based on my interpretation of the meaning of the codes but could use confirmation from others.

* unspecified (0)
* keyCompromise (1)
* affiliationChanged (3)
* superseded (4)
* cessationOfOperation (5)
2016-08-08 14:26:52 -07:00
Ben Irving b587d4e663 Simplify KeyPolicy code (#2092)
This PR, removes the allowedSigningAlgos configuration struct and hard codes a key policy.

Fixes #1844
2016-07-30 16:15:19 -07:00
Roland Bracewell Shoemaker 04961d7c66 Add basic ASN.1 structure test for pre-1.0.2 OpenSSL CSRs (#1972)
Adds a test for CSRs generated using a pre-1.0.2 version of OpenSSL and a buggy client which will fail to parse with Golang 1.6+.

This test checks the values of the bytes in the 8th and 9th offsets, which in a properly formatted CSR should be the version integer declaration bytes, and if the malformed values are present will return a error to the user informing them that they are using an old version of OpenSSL and/or a client which doesn't explicitly set the CSR version.

Fixes #1902.
2016-06-28 12:38:52 -07:00
Ben Irving 6007df8f3c Split up boulder-config.json (WFE) (#1973)
Moves the wfe to it's own config file.

Each config will now belong in `test/config` and `test/config-next` analogous to `boulder-config` and `boulder-config-next`.
2016-06-28 10:40:16 -07:00