boulder

Commit Graph

Author	SHA1	Message	Date
Jacob Hoffman-Andrews	d1e6d0f190	Remove TLS-SNI-01 (#4114) * Remove the challenge whitelist * Reduce the signature for ChallengesFor and ChallengeTypeEnabled * Some unit tests in the VA were changed from testing TLS-SNI to testing the same behavior in TLS-ALPN, when that behavior wasn't already tested. For instance, timeouts during connect are now tested. Fixes #4109	2019-03-15 09:05:24 -04:00
Daniel McCarney	279947ade2	CI/Devenv: restore 20s RA->VA timeout. (#4084) I tried dropping the RA->VA timeout to make the `test_http_challenge_timeout` integration test faster. It seems to flake in CI so I'm restoring the original 20s timeout. This makes `test_http_challenge_timeout` slower but c'est la vie.	2019-02-22 08:53:18 -08:00
Daniel McCarney	3324989205	CI/Dev: Increase RA->VA timeout to 8s. (#4062) There has been some flakiness in CI related to RA->VA timeouts.	2019-02-15 13:38:12 -08:00
Daniel McCarney	1c0be52e53	VA: Add integration test for HTTP timeouts. (#4050) Also update `TestHTTPTimeout` to test with the `SimplifiedVAHTTP` feature flag enabled.	2019-02-12 13:42:01 -08:00
Daniel McCarney	d1daeee831	Config: serverAddresses -> serverAddress. (#4035) The plural `serverAddresses` field in gRPC config has been deprecated for a bit now. We've removed the last usages of it in our staging/prod environments and can clear out the related code. Moving forward we only support a singular `serverAddress` and rely on DNS to direct to multiple instances of a given server.	2019-01-25 10:50:53 -08:00
Jacob Hoffman-Andrews	92e8e1708a	Update config and config-next challenge settings. (#4017) - Allow tls-alpn-01 challenge in config. - Disallow tls-sni-01 challenge in config-next. - Remove gating of tls-alpn integration test. - Remove TLSSNIRevalidation in config-next.	2019-01-18 10:30:38 -08:00
Roland Bracewell Shoemaker	842739bccd	Remove deprecated features that have been purged from prod and staging configs (#4001)	2019-01-15 16:16:35 -08:00
Roland Bracewell Shoemaker	e27f370fd3	Excise code relating to pre-SCT embedding issuance flow (#3769) Things removed: * features.EmbedSCTs (and all the associated RA/CA/ocsp-updater code etc) * ca.enablePrecertificateFlow (and all the associated RA/CA code) * sa.AddSCTReceipt and sa.GetSCTReceipt RPCs * publisher.SubmitToCT and publisher.SubmitToSingleCT RPCs Fixes #3755.	2018-06-28 08:33:05 -04:00
Jacob Hoffman-Andrews	b2f5cf39b9	Bring test/config up to date with test/config-next (#3743) Notably, enable the precertificate flow, RPCHeadroom, and multi-IP hostnames. Lots of other changes and feature flags too.	2018-06-01 12:00:52 -07:00
Daniel McCarney	76a3f4a18f	RA/CA: Use `doNotForceCN: false` for `test/config`. (#3698) In staging/prod we use `doNotForceCN: false` for both the RA & CA config. Switching this to `true` is blocked on CABF work that will likely take considerable time. In the short-term we should use `doNotForceCN: false` in `test/config` and only use `doNotForceCN: true` in `test/config-next`.	2018-05-09 12:54:16 -07:00
Jacob Hoffman-Andrews	a4421ae75b	Run gRPC backends on multiple IPs instead of multiple ports (#3679) We're currently stuck on gRPC v1.1 because of a breaking change to certificate validation in gRPC 1.8. Our gRPC balancer uses a static list of multiple hostnames, and expects to validate against those hostnames. However gRPC expects that a service is one hostname, with multiple IP addresses, and validates all those IP addresses against the same hostname. See grpc/grpc-go#2012. If we follow gRPC's assumptions, we can rip out our custom Balancer and custom TransportCredentials, and will probably have a lower-friction time in general. This PR is the first step in doing so. In order to satisfy the "multiple IPs, one port" property of gRPC backends in our Docker container infrastructure, we switch to Docker's user-defined networking. This allows us to give the Boulder container multiple IP addresses on different local networks, and gives it different DNS aliases in each network. In startservers.py, each shard of a service listens on a different DNS alias for that service, and therefore a different IP address. The listening port for each shard of a service is now identical. This change also updates the gRPC service certificates. Now, each certificate that is used in a gRPC service (as opposed to something that is "only" a client) has three names. For instance, sa1.boulder, sa2.boulder, and sa.boulder (the generic service name). For now, we are validating against the specific hostnames. When we update our gRPC dependency, we will begin validating against the generic service name. Incidentally, the DNS aliases feature of Docker allows us to get rid of some hackery in entrypoint.sh that inserted entries into /etc/hosts. Note: Boulder now has a dependency on the DNS aliases feature in Docker. By default, docker-compose run creates a temporary container and doesn't assign any aliases to it. We now need to specify docker-compose run --use-aliases to get the correct behavior. Without --use-aliases, Boulder won't be able to resolve the hostnames it wants to bind to.	2018-05-07 10:38:31 -07:00
Roland Bracewell Shoemaker	0a86573a73	Update integration tests	2018-04-20 13:18:40 -07:00
Jacob Hoffman-Andrews	4a961c3bc8	Ungate config-next for wfe2 and Wildcards.	2018-03-14 13:18:37 -07:00
Roland Bracewell Shoemaker	9c9e944759	Add SCT embedding (#3521) Adds SCT embedding to the certificate issuance flow. When an issuance is requested, a precertificate (the requested certificate but poisoned with the critical CT extension) is issued and submitted to the required CT logs. Once the SCTs for the precertificate have been collected, a new certificate is issued with the poison extension replaced with an SCT list extension containing the retrieved SCTs. Fixes #2244, fixes #3492 and fixes #3429.	2018-03-12 11:58:30 -07:00
Roland Bracewell Shoemaker	8446571b46	Remove EnforceChallengeDisable (#3444) Removes usage of the `EnforceChallengeDisable` feature. The feature itself is not removed because it is still configured in staging/production; once that is fixed I'll submit another PR removing the actual flag. This keeps the behavior that when authorizations are retrieved from the SA they have their challenges populated, because that seems to make the most sense to me. It also retains TLS re-validation. Fixes #3441.	2018-02-14 13:21:26 -08:00
Jacob Hoffman-Andrews	c556a1a20d	Reduce spurious errors in integration test (#3436) Boulder is fairly noisy about gRPC connection errors. This is a mixed blessing: Our gRPC configuration will try to reconnect until it hits an RPC deadline, and most likely eventually succeed. In that case, we don't consider those to really be errors. However, in cases where a connection is repeatedly failing, we'd like to see errors in the logs about connection failure, rather than "deadline exceeded." So we want to keep logging of gRPC errors. However, right now we get a lot of these errors logged during integration tests. They make the output hard to read, and may disguise more serious errors. So we'd like to avoid causing such errors in normal integration test operation. This change reorders the startup of Boulder components by their gRPC dependencies, so everything's backend is likely to be up and running before it starts. It also reverses that order for clean shutdowns, and waits for each process to exit before signalling the next one. With these changes, I still got connection errors. Taking listenbuddy out of the gRPC path fixed them. I believe the issue is that listenbuddy is not a truly transparent proxy. In particular, it accepts an inbound TCP connection before opening an outbound TCP connection. If opening that outbound connection results in "connection refused," it closes the inbound connection. That means gRPC sees a "connection closed" (or "connection reset"?) rather than "connection refused". I'm guessing it handles those cases differently, explaining the different error results. We've been using listenbuddy to trigger disconnects while Boulder is running, to ensure that gRPC's reconnect code works. I think we can probably rely on gRPC's reconnect to work. The initial problem that led us to start testing this was a configuration problem; now that we have the configuration we want, we should be fine and don't need to keep testing reconnects on every integration test run.	2018-02-12 18:17:50 -08:00
Roland Bracewell Shoemaker	fc5c8f76b6	Remove unused features (#3393) This removes a number of unused features (i.e. they are never checked anywhere).	2018-01-25 08:55:05 -05:00
Daniel McCarney	c6d56b7a84	Match RA `authorizationLifetimeDays` to prod. (#3370)	2018-01-16 10:39:57 -08:00
Jacob Hoffman-Andrews	198fd1426a	Bring config up-to-date with prod. (#3359) This brings in some changes from config-next that are now live in production.	2018-01-11 16:29:41 -05:00
Daniel McCarney	55dd1020c0	Increase VA SingleDialTimeout to 10s. (#3260) This PR changes the VA's singleDialTimeout value from 5 * time.Second to 10 * time.Second. This will give slower servers a better chance to respond, especially for the multi-VA case where n requests arrive ~simultaneously. This PR also bumps the RA->VA timeout by 5s and the WFE->RA timeout by 5s to accommodate the increased dial timeout. I put this in a separate commit in case we'd rather deal with this separately.	2017-12-04 09:53:26 -08:00
Jacob Hoffman-Andrews	0a64fd4066	Bring test/config up-to-date. (#3056) Methodology: Copy test/config-next/* into test/config/, then manually review the diffs, removing any diffs that are not yet in production.	2017-09-11 16:55:58 -04:00
Jacob Hoffman-Andrews	3431acfb92	Adjust testing maxNames config to match prod. (#2911)	2017-07-27 15:23:29 -07:00
Roland Bracewell Shoemaker	a46d30945c	Purge remaining AMQP code (#2648) Deletes github.com/streadway/amqp and the various RabbitMQ setup tools etc. Changes how listenbuddy is used to proxy all of the gRPC client -> server connections so we test reconnection logic. +49 -8,221 😁 Fixes #2640 and #2562.	2017-04-04 15:02:22 -07:00
Jacob Hoffman-Andrews	6719dc17a6	Remove AMQP config and code (#2634) We now use gRPC everywhere.	2017-04-03 10:39:39 -04:00
Jacob Hoffman-Andrews	cbde78d58f	Harmonize and tweak configs (#2479) Set authorizationLifetimeDays to 60 across both config and config-next. Set NumSessions to 2 in both config and config-next. A decrease from 10 because pkcs11-proxy (or pkcs11-daemon?) seems to error out under load if you have more sessions than CPUs. Reorder parallelGenerateOCSPRequests to match config-next. Remove extra tags for parsing yaml in config objects.	2017-01-10 13:46:38 -08:00
Jacob Hoffman-Andrews	089a270453	Add instructions on load testing OCSP generation. (#2459)	2017-01-02 11:36:03 -08:00
Jacob Hoffman-Andrews	1c1449b284	Improvements to tests and test configs. (#2396) - Remove spinner from test.js. It made Travis logs hard to read. - Listen on all interfaces for debugAddr. This makes it possible to check Prometheus metrics for instances running in a Docker container. - Standardize DNS timeouts on 1s and 3 retries across all configs. This ensures DNS completes within the relevant RPC timeouts. - Remove RA service queue from VA, since VA no longer uses the callback to RA on completing a challenge.	2016-12-05 14:35:27 -08:00
Daniel McCarney	eb67ad4f88	Allow `validateEmail` to timeout w/o error. (#2288) This PR reworks the validateEmail() function from the RA to allow timeouts during DNS validation of MX/A/AAAA records for an email to be non-fatal and match our intention to verify emails best-effort. Notes: bdns/problem.go - DNSError.Timeout() was changed to also include context cancellation and timeout as DNS timeouts. This matches what DNSError.Error() was doing to set the error message and supports external callers to Timeout not duplicating the work. bdns/mocks.go - the LookupMX mock was changed to support always.error and always.timeout in a manner similar to the LookupHost mock. Otherwise the TestValidateEmail unit test for the RA would fail when the MX lookup completed before the Host lookup because the error wouldn't be correct (empty DNS records vs a timeout or network error). test/config/ra.json, test/config-next/ra.json - the dnsTries and dnsTimeout values were updated such that dnsTries * dnsTimeout was <= the WFE->RA RPC timeout (currently 15s in the test configs). This allows the DNS lookups to all time out without the overall RPC timing out. Resolves #2260.	2016-10-27 11:56:12 -07:00
Jacob Hoffman-Andrews	e1bc1e5b29	Update config from config-next. (#2175) Set feature flags: "reuseValidAuthz": true, "authorizationLifetimeDays": 90, "pendingAuthorizationLifetimeDays": 7, "CAASERVFAILExceptions": "test/caa-servfail-exceptions.txt", "lookupIPV6": true, "allowAuthzDeactivation": true, Remove BaseURL. Remove trailing slash on CT log URL. All files now have trailing newlines.	2016-09-19 14:08:36 -07:00
Jacob Hoffman-Andrews	d75a44baa0	Remove "network" and "server" from syslog configs. (#2159) We removed these from the config object because we never use anything other than the default empty string, which means "local socket."	2016-09-08 10:08:18 -04:00
Ben Irving	c4f7fb580d	Split up boulder-config.json (RA) (#1974) Part of #1962	2016-06-29 13:43:55 -07:00


81 Commits