boulder

Commit Graph

Author	SHA1	Message	Date
Jacob Hoffman-Andrews	29724cb0b7	ocsp/responder: update Redis source to use live signing (#6207 ) This enables ocsp-responder to talk to the RA and request freshly signed OCSP responses. ocsp/responder/redis_source is moved to ocsp/responder/redis/redis_source.go and significantly modified. Instead of assuming a response is always available in Redis, it wraps a live-signing source. When a response is not available, it attempts a live signing. If live signing succeeds, the Redis responder returns the result right away and attempts to write a copy to Redis on a goroutine using a background context. To make things more efficient, I eliminate an unneeded ocsp.ParseResponse from the storage path. And I factored out a FakeResponse helper to make the unit tests more manageable. Commits should be reviewable one-by-one. Fixes #6191	2022-07-18 10:47:14 -07:00
Aaron Gable	3000339dee	Reject CSRs with duplicate extensions (#6153 ) This behavior will be on by default in go1.19, so let's turn it on ourselves now to ensure there won't be any breakage when we upgrade in August.	2022-06-17 13:13:30 -07:00
Aaron Gable	11544756bb	Support new Google CT Policy (#6082 ) Add a new code path to the ctpolicy package which enforces Chrome's new CT Policy, which requires that SCTs come from logs run by two different operators, rather than one Google and one non-Google log. To achieve this, invert the "race" logic: rather than assuming we always have two groups, and racing the logs within each group against each other, we now race the various groups against each other, and pick just one arbitrary log from each group to attempt submission to. Ensure that the new code path does the right thing by adding a new zlint which checks that the two SCTs embedded in a certificate come from logs run by different operators. To support this lint, which needs to have a canonical mapping from logs to their operators, import the Chrome CT Log List JSON Schema and autogenerate Go structs from it so that we can parse a real CT Log List. Also add flags to all services which run these lints (the CA and cert-checker) to let them load a CT Log List from disk and provide it to the lint. Finally, since we now have the ability to load a CT Log List file anyway, use this capability to simplify configuration of the RA. Rather than listing all of the details for each log we're willing to submit to, simply list the names (technically, Descriptions) of each log, and look up the rest of the details from the log list file. To support this change, SRE will need to deploy log list files (the real Chrome log list for prod, and a custom log list for staging) and then update the configuration of the RA, CA, and cert-checker. Once that transition is complete, the deletion TODOs left behind by this change will be able to be completed, removing the old RA configuration and old ctpolicy race logic. Part of #5938	2022-05-25 15:14:57 -07:00
Aaron Gable	dab8a71b0e	Use new RA methods from WFE revocation path (#5983 ) Simplify the WFE `RevokeCertificate` API method in three ways: - Remove most of the logic checking if the requester is authorized to revoke the certificate in question (based on who is making the request, what authorizations they have, and what reason they're requesting). That checking is now done by the RA. Instead, simply verify that the JWS is authenticated. - Remove the hard-to-read `authorizedToRevoke` callbacks, and make the `revokeCertBySubscriberKey` (nee `revokeCertByKeyID`) and `revokeCertByCertKey` (nee `revokeCertByJWK`) helpers much more straight-line in their execution logic. - Call the RA's new `RevokeCertByApplicant` and `RevokeCertByKey` gRPC methods, rather than the deprecated `RevokeCertificateWithReg`. This change, without any flag flips, should be invisible to the end-user. It will slightly change some of our log message formats. However, by now relying on the new RA gRPC revocation methods, this change allows us to change our revocation policies by enabling the `AllowDoubleRevocation` and `MozRevocationReasons` feature flags, which affect the behavior of those new helpers. Fixes #5936	2022-03-28 14:14:11 -07:00
Aaron Gable	07d56e3772	Add new, simpler revocation methods to RA (#5969 ) Add two new gRPC methods to the RA: - `RevokeCertByKey` will be used when the API request was signed by the certificate's keypair, rather than a Subscriber keypair. If the request is for reason `keyCompromise`, it will ensure that the key is added to the blocked keys table, and will attempt to "re-revoke" a certificate that was already revoked for some other reason. - `RevokeCertByApplicant` supports both the path where the original subscriber requests revocation via the API and the path where another account which has proven control over all of the identifiers in the certificate does so. It does not allow the requested reason to be `keyCompromise`, as these requests do not represent a demonstration of key compromise. In addition, add a new feature flag `MozRevocationReasons` which controls the behavior of these new methods. If the flag is not set, they behave like they have historically (see above). If the flag is set to true, then the new methods enforce the upcoming Mozilla policies around revocation reasons, namely: - Only the original Subscriber can choose the revocation reason; other clients will get a set reason code based on the method of requesting revocation. When the original Subscriber requests reason `keyCompromise`, this request will be honored, but the key will not be blocked and other certificates with that key will not also be revoked. - Revocations signed with the certificate key will always get reason `keyCompromise`, because we do not know who is sending the request and therefore must assume that the use of the key in this way represents compromise. Because these requests will always be for reason `keyCompromise`, they will always be added to the blocked keys table and they will always attempt "re-revocation". - Revocations authorized via control of all names in the cert will always get reason `cessationOfOperation`, which is to be used when the original Subscriber does not control all names in the certificate anymore. Finally, update the existing `AdministrativelyRevokeCertificate` method to use the new helper functions shared by the two new methods. Part of #5936	2022-03-14 08:58:17 -07:00
Aaron Gable	89000bd61c	Add close-primes detection via Fermat's factorization (#5853 ) Add a new check to GoodKey which attempts to factor the public modulus of the presented key using Fermat's factorization method. This method will succeed if and only if the prime factors are very close to each other -- i.e. almost certainly were not selected independently from a random uniform distribution, but were instead calculated via some other less secure method. To support this new feature, add a new config flag to the RA, CA, and WFE, which all use the GoodKey checks. As part of adding this new config value, refactor the GoodKey config items into their own config struct which can be re-used across all services. If the new `FermatRounds` config value has not been set, it will default to zero, causing no factorization to be attempted. Fixes #5850 Part of #5851	2021-12-14 09:19:33 -08:00
Jacob Hoffman-Andrews	ba0ea090b2	integration: save hierarchy across runs (#5729 ) This allows repeated runs using the same hierarchy, and avoids spurious errors from ocsp-updater saying "This CA doesn't have an issuer cert with ID XXX" Fixes #5721	2021-10-20 17:06:33 -07:00
Aaron Gable	4ef9fb1b4f	Add new SA.NewOrderAndAuthzs gRPC method (#5602 ) Add a new method to the SA's gRPC interface which takes both an Order and a list of new Authorizations to insert into the database, and adds both (as well as the various ancillary rows) inside a transaction. To enable this, add a new abstraction layer inside the `db/` package that facilitates inserting many rows at once, as we do for the `authz2`, `orderToAuthz2`, and `requestedNames` tables in this operation. Finally, add a new codepath to the RA (and a feature flag to control it) which uses this new SA method instead of separately calling the `NewAuthorization` method multiple times. Enable this feature flag in the config-next integration tests. This should reduce the failure rate of the new-order flow by reducing the number of database operations by coalescing multiple inserts into a single multi-row insert. It should also reduce the incidence of new authorizations being created in the database but then never exposed to the subscriber because of a failure later in the new-order flow, both by reducing failures overall and by adding those authorizations in a transaction which will be rolled back if there is a later failure. Fixes #5577	2021-09-03 13:48:04 -07:00
Aaron Gable	9abb39d4d6	Honeycomb integration proof-of-concept (#5408 ) Add Honeycomb tracing to all Boulder components which act as HTTP servers, gRPC servers, or gRPC clients. Add many values which we currently emit to logs to the trace spans. Add a way to configure the Honeycomb integration to our config files, and by default configure all of our tests to "mute" (send nothing). Followup changes will refine the configuration, attempt to reduce the new dependency load, and introduce better sampling. Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218	2021-05-24 16:13:08 -07:00
Jacob Hoffman-Andrews	b4e483d38b	Add gRPC MaxConnectionAge config. (#5311 ) This allows servers to tell clients to go away after some period of time, which triggers the clients to re-resolve DNS. Per grpc/grpc#12295, this is the preferred way to do this. Related: #5307.	2021-03-01 18:37:47 -08:00
Aaron Gable	16c7a21a57	RA: Multi-issuer support for OCSP purging (#5160 ) The RA is responsible for contacting Akamai to purge cached OCSP responses when a certificate is revoked and fresh OCSP responses need to be served ASAP. In order to do so, it needs to construct the same OCSP URLs that clients would construct, and that Akamai would cache. In order to do that, it needs access to the issuing certificate to compute a hash across its Subject Info and Public Key. Currently, the RA holds a single issuer certificate in memory, and uses that cert to compute all OCSP URLs, on the assumption that all certs we're being asked to revoke were issued by the same issuer. In order to support issuance from multiple intermediates at the same time (e.g. RSA and ECDSA), and to support rollover between different issuers of the same type (we may need to revoke certs issued by two different issuers for the 90 days in which their end-entity certs overlap), this commit changes the configuration to provide a list of issuer certificates instead. In order to support efficient lookup of issuer certs, this change also introduces a new concept, the Chain ID. The Chain ID is a truncated hash across the raw bytes of either the Issuer Info or the Subject Info of a given cert. As such, it can be used to confirm issuer/subject relationships between certificates. In the future, this may be a replacement for our current IssuerID (a truncated hash over the whole issuer certificate), but for now it is used to map revoked certs to their issuers inside the RA. Part of #5120	2020-11-06 13:58:32 -08:00
Aaron Gable	3666322817	Add health-checker tool and use it from startservers.py (#5095 ) This adds a new tool, `health-checker`, which is a client of the new Health Checker Service that has been integrated into all of our boulder components. This tool takes an address, a timeout, and a config file. It then attempts to connect to a gRPC Health Service at the given address, retrying until it hits its timeout, using credentials specified by the config file. This is then wrapped by a new function `waithealth` in our Python helpers, which serves much the same function as `waitport`, but specifically for services which surface a gRPC Health Service. This in turn requires slight modifications to `startservers`, namely specifying the address and port on which each service starts its gRPC listener. Finally, this change also introduces new credentials for this health-checker, and adds those credentials as a valid client to all services' json configs. A similar change would have to be made to our production configs if we were to establish a long-lived health checker/prober in prod. Fixes #5074	2020-10-06 15:01:35 -07:00
Aaron Gable	440c5f96d9	Remove unreferenced values from test configs (#4959 )	2020-07-15 13:50:00 -07:00
Roland Bracewell Shoemaker	aa79d8360d	Move FasterNewOrdersRateLimit feature flag to the right test/config-next file (#4929 )	2020-07-02 14:03:38 -07:00
Aaron Gable	91d4e235ad	Deprecate the BlockedKeyTable feature flag (#4881 ) This commit consists of three classes of changes: 1) Changing various command main.go files to always behave as they would have when features.BlockedKeyTable was true. Also changing one test in the same manner. 2) Removing the BlockedKeyTable flag from configuration in config-next, because the flag is already live. 3) Moving the BlockedKeyTable flag to the "deprecated" section of features.go, and regenerating featureflag_strings.go. A future change will remove the BlockedKeyTable flag (and other similarly deprecated flags) from features.go entirely. Fixes #4873	2020-06-22 16:35:37 -07:00
Roland Bracewell Shoemaker	b7ad70caff	sa: implement faster new orders rate limit (#4857 ) Fixes #4840	2020-06-09 17:14:23 -07:00
Roland Bracewell Shoemaker	56898e8953	Log RSA key sizes in WFE/WFE2 and add feature to restrict them (#4839 ) Currently 99.99% of RSA keys we see in certificates at Let's Encrypt are either 2048, 3072, or 4096 bits, but we support every 8 bit increment between 2048 and 4096. Supporting these uncommon key sizes opens us up to having to block much larger ranges of keys when dealing with something like the Debian weak keys incident. Instead we should just reduce the set of key sizes we support down to what people actually use. Fixes #4835.	2020-06-08 11:23:27 -07:00
Roland Bracewell Shoemaker	7673f02803	Use cmd/ceremony in integration tests (#4832 ) This ended up taking a lot more work than I expected. In order to make the implementation more robust a bunch of stuff we previously relied on has been ripped out in order to reduce unnecessary complexity (I think I insisted on a bunch of this in the first place, so glad I can kill it now). In particular this change: * Removes bhsm and pkcs11-proxy: softhsm and pkcs11-proxy don't play well together, and any softhsm manipulation would need to happen on bhsm, then require a restart of pkcs11-proxy to pull in the on-disk changes. This makes manipulating softhsm from the boulder container extremely difficult, and because of the need to initialize new slots on each run (described below) we need direct access to the softhsm2 tools since pkcs11-tool cannot do slot initialization operations over the wire. I originally argued for bhsm as a way to mimic a network attached HSM, mainly so that we could do network level fault testing. In reality we've never actually done this, and the extra complexity is not really realistic for a handful of reasons. It seems better to just rip it out and operate directly on a local softhsm instance (the other option would be to use pkcs11-proxy locally, but this still would require manually restarting the proxy whenever softhsm2-util was used, and wouldn't really offer any realistic benefit). * Initializes the softhsm slots on each integration test run, rather than when creating the docker image (this is necessary to prevent churn in test/cert-ceremonies/generate.go, which would need to be updated to reflect the new slot IDs each time a new boulder-tools image was created since slot IDs are randomly generated) * Installs softhsm from source so that we can use a more up to date version (2.5.0 vs. 2.2.0 which is in the debian repo) * Generates the root and intermediate private keys in softhsm and writes out the root and intermediate public keys to /tmp for use in integration tests (the existing test-{ca,root} certs are kept in test/ because they are used in a whole bunch of unit tests. At some point these should probably be renamed/moved to be more representative of what they are used for, but that is left for a follow-up in order to keep the churn in this PR as related to the ceremony work as possible) Another follow-up item here is that we should really be zeroing out the database at the start of each integration test run, since certain things like certificates and ocsp responses will be signed by a key/issuer that is no longer in use/doesn't match the current key/issuer. Fixes #4832.	2020-06-03 15:20:23 -07:00
Roland Bracewell Shoemaker	70ff4d9347	Add bad-key-revoker daemon (#4788 ) Adds a daemon which monitors the new blockedKeys table and checks for any unexpired, unrevoked certificates that are associated with the added SPKI hashes and revokes them, notifying the user that issued the certificates. Fixes #4772.	2020-04-23 11:51:59 -07:00
Jacob Hoffman-Andrews	87fb6028c1	Add log validator to integration tests (#4782 ) For now this mainly provides an example config and confirms that log-validator can start up and shut down cleanly, as well as provide a stat indicating how many log lines it has handled. This introduces a syslog config to the boulder-tools image that will write logs to /var/log/program.log. It also tweaks the various .json config files so they have non-default syslogLevel, to ensure they actually write something for log-validator to verify.	2020-04-20 13:33:42 -07:00
Roland Bracewell Shoemaker	9df97cbf06	Add a blocked keys table, and use it (#4773 ) Fixes #4712 and fixes #4711.	2020-04-15 13:42:51 -07:00
Roland Bracewell Shoemaker	ea231adc36	features: remove deprecated feature flags (#4607 ) Confirmed none of these features are currently present in any staging or production configs.	2019-12-09 15:59:27 -05:00
Daniel McCarney	fde145ab96	RA: implement stricter email validation. (#4574 ) Prev. we weren't checking the domain portion of an email contact address very strictly in the RA. This updates the PA to export a function that can be used to validate the domain the same way we validate domain portions of DNS type identifiers for issuance. This also changes the RA to use the `invalidEmail` error type in more places. A new Go integration test is added that checks these errors end-to-end for both account creation and account update.	2019-11-22 13:39:31 -05:00
Roland Bracewell Shoemaker	46e0468220	Make authz2 the default storage format (#4476 ) This change set makes the authz2 storage format the default format. It removes most of the functionality related to the previous storage format, except for the SA fallbacks and old gRPC methods which have been left for a follow-up change in order to make these changes deployable without introducing incompatibilities. Fixes #4454.	2019-10-21 15:29:15 -04:00
Daniel McCarney	f02e9da38f	Support admin. blocking public keys. (#4419 ) We occasionally have reason to block public keys from being used in CSRs or for JWKs. This work adds support for loading a YAML blocked keys list to the WFE, the RA and the CA (all the components already using the `goodkey` package). The list is loaded in-memory and is intended to be used sparingly and not for more complicated mass blocking scenarios. This augments the existing debian weak key checking which is specific to RSA keys and operates on a truncated hash of the key modulus. In comparison the admin. blocked keys are identified by the Base64 encoding of a SHA256 hash over the DER encoding of the public key expressed as a PKIX subject public key. For ECDSA keys in particular we believe a more thorough solution would have to consider inverted curve points but to start we're calling this approach "Good Enough". A utility program (`block-a-key`) is provided that can read a PEM formatted x509 certificate or a JSON formatted JWK and emit lines to be added to the blocked keys YAML to block the related public key. A test blocked keys YAML file is included (`test/example-blocked-keys.yml`), initially populated with a few of the keys from the `test/` directory. We may want to do a more thorough pass through Boulder's source code and add a block entry for every test private key. Resolves https://github.com/letsencrypt/boulder/issues/4404	2019-09-06 16:54:26 -04:00
Roland Bracewell Shoemaker	62e52f4103	test: weakKeyDirectory -> weakKeyFile in test configs (#4397 )	2019-08-12 18:07:56 -04:00
Roland Bracewell Shoemaker	acc44498d1	RA: Make RevokeAtRA feature standard behavior (#4268 ) Now that it is live in production and is working as intended we can remove the old ocsp-updater functionality entirely. Fixes #4048.	2019-06-20 14:32:53 -04:00
Roland Bracewell Shoemaker	3532dce246	Excise grpc maxConcurrentStreams configuration (#4257 )	2019-06-12 09:35:24 -04:00
Roland Bracewell Shoemaker	4d40cf58e4	Enable integration tests for authz2 and fix a few bugs (#4221 ) Enables integration tests for authz2 and fixes a few bugs that were flagged up during the process. Disables expired-authorization-purger integration tests if config-next is being used as expired-authz-purger expects to purge some stuff but doesn't know about authz2 authorizations, a new test will be added with #4188. Fixes #4079.	2019-05-23 15:06:50 -07:00
Jacob Hoffman-Andrews	0c700143bb	Clean up README and test configs (#4185 ) - docker-rebuild isn't needed now that boulder and bhsm containers run directly off the boulder-tools image. - Remove DNS options from RA config. - Remove GSB options from VA config.	2019-04-30 13:26:19 -07:00
Daniel McCarney	748f315b1a	PA: Support YAML for hostname policy. (#4180 ) Our existing hostname policy configs use JSON. We would like to switch to YAML to match the rate limit policy configs and to support commenting/tagging entries in the hostname policy. The PA is updated to support both JSON and YAML while we migrate the existing policy data to the new format. To verify there are no changes in functionality the existing unit tests were updated to test the same policy expressed in YAML and JSON form. In integration tests the `config` tree continues to use a JSON hostname policy file to test the legacy support. In `config-next` a YAML hostname policy file is used instead. The new YAML format allows separating out `HighRiskBlockedNames` (primarily used to meet the requirement of managing issuance for "high risk" domains per CABF BRs, mostly static) and `AdminBlockedNames` (used after administrative action is taken to block a domain. Additions are made with some frequency). Since the rate at which we change entries in these lists differs it is helpful to separate them in the policy content.	2019-04-26 14:35:28 -04:00
Jacob Hoffman-Andrews	8f578f3a93	Improve integration tests (#4143 ) - Move fakeclock, get_future_output, and random_domain to helpers.py. - Remove tempdir handling from integration-test.py since it's already done in helpers.py - Consolidate handling of config dir into helpers.py, and add CONFIG_NEXT boolean. - Move RevokeAtRA config gating into verify_revocation to reduce redundancy. - Skip load-balancing test when filter is enabled. - Ungate test_sct_embedding - Rework test_ct_submissions, which was out of date. In particular, have a couple of logs where submitFinalCert: false, and make ct-test-srv store submission counts by hostnames for better test case isolation.	2019-04-04 10:59:38 -07:00
Jacob Hoffman-Andrews	d1e6d0f190	Remove TLS-SNI-01 (#4114 ) * Remove the challenge whitelist * Reduce the signature for ChallengesFor and ChallengeTypeEnabled * Some unit tests in the VA were changed from testing TLS-SNI to testing the same behavior in TLS-ALPN, when that behavior wasn't already tested. For instance timeouts during connect are now tested. Fixes #4109	2019-03-15 09:05:24 -04:00
Daniel McCarney	279947ade2	CI/Devenv: restore 20s RA->VA timeout. (#4084 ) I tried dropping the RA->VA timeout to make the `test_http_challenge_timeout` integration test faster. It seems to flake in CI so I'm restoring the original 20s timeout. This makes `test_http_challenge_timeout` slower but c'est la vie.	2019-02-22 08:53:18 -08:00
Daniel McCarney	16e464a37d	RA: apply certificate rate limits at NewOrder time. (#4074 ) If an order for a given set of names will fail finalization because of certificate rate limits (certs per domain, certs per fqdn set) there isn't any point in allowing an order for those names to be created. We can stop a lot of requests earlier by enforcing the cert rate limits at new order time as well as finalization time. A new RA `EarlyOrderRateLimit` feature flag controls whether this is done or not. Resolves #3975	2019-02-21 11:02:40 -08:00
Daniel McCarney	3324989205	CI/Dev: Increase RA->VA timeout to 8s. (#4062 ) There has been some flakiness in CI related to RA->VA timeouts.	2019-02-15 13:38:12 -08:00
Roland Bracewell Shoemaker	3e54cea295	Implement direct revocation at RA (#4043 ) Implements a feature that enables immediate revocation instead of marking a certificate revoked and waiting for the OCSP-Updater to generate the OCSP response. This means that as soon as the request returns from the WFE the revoked OCSP response should be available to the user. This feature requires that the RA be configured to use the standalone Akamai purger service. Fixes #4031.	2019-02-14 14:47:42 -05:00
Daniel McCarney	1c0be52e53	VA: Add integration test for HTTP timeouts. (#4050 ) Also update `TestHTTPTimeout` to test with the `SimplifiedVAHTTP` feature flag enabled.	2019-02-12 13:42:01 -08:00
Jacob Hoffman-Andrews	92e8e1708a	Update config and config-next challenge settings. (#4017 ) - Allow tls-alpn-01 challenge in config. - Disallow tls-sni-01 challenge in config-next. - Remove gating of tls-alpn integration test. - Remove TLSSNIRevalidation in config-next.	2019-01-18 10:30:38 -08:00
Roland Bracewell Shoemaker	842739bccd	Remove deprecated features that have been purged from prod and staging configs (#4001 )	2019-01-15 16:16:35 -08:00
Roland Bracewell Shoemaker	196f019851	Add support for temporal CT logs (#3853 ) Required a little bit of rework of the RA issuance flow (to add parsing of the precert to determine the expiration date, and moving final cert parsing before final cert submission) and RA tests, but I think it shouldn't create any issues... Fixes #3197.	2018-09-14 16:14:42 -07:00
Daniel McCarney	d39babdcf3	RA: Remove vestigial DNS config/setup. (#3854 ) In `db01b0b` we removed email validation from the RA. This was the only use of the `bdns` package by the RA and so we can go one step further and delete the remaining setup, configuration and `bdns` fields.	2018-09-13 13:39:23 -04:00
Roland Bracewell Shoemaker	876c727b6f	Update gRPC (#3817 ) Fixes #3474.	2018-08-20 10:55:42 -04:00
Jacob Hoffman-Andrews	36a83150ad	Add stagger to CT log submissions. (#3794 ) This allows each log a chance to respond before we move onto the next, spreading our load more evenly across the logs in a log group.	2018-07-06 16:25:51 -04:00
Roland Bracewell Shoemaker	e27f370fd3	Excise code relating to pre-SCT embedding issuance flow (#3769 ) Things removed: * features.EmbedSCTs (and all the associated RA/CA/ocsp-updater code etc) * ca.enablePrecertificateFlow (and all the associated RA/CA code) * sa.AddSCTReceipt and sa.GetSCTReceipt RPCs * publisher.SubmitToCT and publisher.SubmitToSingleCT RPCs Fixes #3755.	2018-06-28 08:33:05 -04:00
Maciej Dębski	bb9ddb124e	Implement TLS-ALPN-01 and integration test for it (#3654 ) This implements the newly proposed TLS-ALPN-01 validation method, as described in https://tools.ietf.org/html/draft-ietf-acme-tls-alpn-01 This challenge type is disabled except in the config-next tree.	2018-06-06 13:04:09 -04:00
Jacob Hoffman-Andrews	dbcb16543e	Start using multiple-IP hostnames for load balancing (#3687 ) We'd like to start using the DNS load balancer in the latest version of gRPC. That means putting all IPs for a service under a single hostname (or using a SRV record, but we're not taking that path). This change adds an sd-test-srv to act as our service discovery DNS service. It returns both Boulder IP addresses for any A lookup ending in ".boulder". This change also sets up the Docker DNS for our boulder container to defer to sd-test-srv when it doesn't know an answer. sd-test-srv doesn't know how to resolve public Internet names like `github.com`. Resolving public names is required for the `godep-restore` test phase, so this change breaks out a copy of the boulder container that is used only for `godep-restore`. This change implements a shim of a DNS resolver for gRPC, so that we can switch to DNS-based load balancing with the currently vendored gRPC, then when we upgrade to the latest gRPC we won't need a simultaneous config update. Also, this change introduces a check at the end of the integration test that each backend received at least one RPC, ensuring that we are not sending all load to a single backend.	2018-05-23 09:47:14 -04:00
Daniel McCarney	76a3f4a18f	RA/CA: Use `doNotForceCN: false` for `test/config`. (#3698 ) In staging/prod we use `doNotForceCN: false` for both the RA & CA config. Switching this to `true` is blocked on CABF work that will likely take considerable time. In the short-term we should use `doNotForceCN: false` in `test/config` and only use `doNotForceCN: true` in `test/config-next`.	2018-05-09 12:54:16 -07:00
Jacob Hoffman-Andrews	a4421ae75b	Run gRPC backends on multiple IPs instead of multiple ports (#3679 ) We're currently stuck on gRPC v1.1 because of a breaking change to certificate validation in gRPC 1.8. Our gRPC balancer uses a static list of multiple hostnames, and expects to validate against those hostnames. However gRPC expects that a service is one hostname, with multiple IP addresses, and validates all those IP addresses against the same hostname. See grpc/grpc-go#2012. If we follow gRPC's assumptions, we can rip out our custom Balancer and custom TransportCredentials, and will probably have a lower-friction time in general. This PR is the first step in doing so. In order to satisfy the "multiple IPs, one port" property of gRPC backends in our Docker container infrastructure, we switch to Docker's user-defined networking. This allows us to give the Boulder container multiple IP addresses on different local networks, and gives it different DNS aliases in each network. In startservers.py, each shard of a service listens on a different DNS alias for that service, and therefore a different IP address. The listening port for each shard of a service is now identical. This change also updates the gRPC service certificates. Now, each certificate that is used in a gRPC service (as opposed to something that is "only" a client) has three names. For instance, sa1.boulder, sa2.boulder, and sa.boulder (the generic service name). For now, we are validating against the specific hostnames. When we update our gRPC dependency, we will begin validating against the generic service name. Incidentally, the DNS aliases feature of Docker allows us to get rid of some hackery in entrypoint.sh that inserted entries into /etc/hosts. Note: Boulder now has a dependency on the DNS aliases feature in Docker. By default, docker-compose run creates a temporary container and doesn't assign any aliases to it. We now need to specify docker-compose run --use-aliases to get the correct behavior. Without --use-aliases, Boulder won't be able to resolve the hostnames it wants to bind to.	2018-05-07 10:38:31 -07:00
Roland Bracewell Shoemaker	0a86573a73	Update integration tests	2018-04-20 13:18:40 -07:00
Roland Bracewell Shoemaker	ebc86fd778	Fix CT submission cancelations (#3658 ) When the WFE calls the RA, the RA creates a sub context which is cancelled when the RPC returns. Because we were spawning the publisher RPC calls in a goroutine with the context from ra.IssueCertificate, as soon as ra.IssueCertificate returned, that context was being canceled, which in turn canceled the publisher RPC calls. Instead of using the RA RPC context, simply use a `context.Background()` so that the RPC context doesn't break these submissions. Also return to pre-features.CancelCTSubmissions behavior where precert submissions would be canceled once we retrieved SCTs from the winning logs instead of relying on the magic behavior of the RA RPC canceling them itself.	2018-04-20 11:26:02 -04:00
Roland Bracewell Shoemaker	1271a15be7	Submit final certs to CT logs (#3640 ) Submits final certificates to any configured CT logs. This doesn't introduce a feature flag as it is config gated, any log we want to submit final certificates to needs to have its log description updated to include the `"submitFinalCerts": true` field. Fixes #3605.	2018-04-13 12:02:01 -04:00
Jacob Hoffman-Andrews	2a1cd4981a	Allow configuring gRPC's MaxConcurrentStreams (#3642 ) During periods of peak load, some RPCs are significantly delayed (on the order of seconds) by client-side blocking. HTTP/2 clients have to obey a "max concurrent streams" setting sent by the server. In Go's HTTP/2 implementation, this value [defaults to 250](https://github.com/golang/net/blob/master/http2/server.go#L56), so the gRPC default is also 250. So whenever there are more than 250 requests in progress at a time, additional requests will be delayed until there is a slot available. During this peak load, we aren't hitting limits on CPU or memory, so we should increase the max concurrent streams limit to take better advantage of our available resources. This PR adds a config field to do that. Fixes #3641.	2018-04-12 17:17:17 -04:00
Jacob Hoffman-Andrews	a4f9de9e35	Improve nesting of RPC deadlines (#3619 ) gRPC passes deadline information through the RPC boundary, but client and server have the same deadline. Ideally we'd like the server to have a slightly tighter deadline than the client, so if one of the server's onward RPCs or other network calls times out, the server can pass back more detailed information to the client, rather than the client timing out the server and losing the opportunity to log more detailed information about which component caused the timeout. In this change, I subtract 100ms from the deadline on the server side of our interceptors, using our existing serverInterceptor. I also check that there is at least 100ms remaining in which to do useful work, so the server doesn't begin a potentially expensive task only to abort it. Fixes #3608.	2018-04-06 15:40:18 +01:00
Roland Bracewell Shoemaker	cc5ec34539	Allow configuration of multiple DNS resolvers (#3612 ) * Allow configuration of multiple DNS resolvers * Use multiple DNS resolvers in integration tests Fixes #3611.	2018-04-05 11:51:22 -04:00
Jacob Hoffman-Andrews	700604dda1	Overlapping wildcard errors are 400, not 500. (#3561 ) Return a malformed error for these requests. Also add an integration test. Fixes #3558	2018-03-14 13:18:25 -07:00
Roland Bracewell Shoemaker	9c9e944759	Add SCT embedding (#3521 ) Adds SCT embedding to the certificate issuance flow. When an issuance is requested a precertificate (the requested certificate but poisoned with the critical CT extension) is issued and submitted to the required CT logs. Once the SCTs for the precertificate have been collected a new certificate is issued with the poison extension replaced with an SCT list extension containing the retrieved SCTs. Fixes #2244, fixes #3492 and fixes #3429.	2018-03-12 11:58:30 -07:00
Jacob Hoffman-Andrews	11434650b7	Check safe browsing at validation time (#3539 ) Right now we check safe browsing at new-authz time, which introduces a possible external dependency when calling new-authz. This is usually fine, since most safe browsing checks can be satisfied locally, but when requests have to go external, it can create variance in new-authz timing. Fixes #3491.	2018-03-09 11:15:05 +00:00
Daniel McCarney	28cc969814	Remove TLS-SNI-02 implementation. (#3516 ) This code was never enabled in production. Our original intent was to ship this as part of the ACMEv2 API. Before that could happen flaws were identified in TLS-SNI-01|02 that resulted in TLS-SNI-02 being removed from the ACME protocol. We won't ever be enabling this code and so we might as well remove it.	2018-03-02 10:56:13 -08:00
Roland Bracewell Shoemaker	0b53063a72	ctpolicy: Add informational logs and don't cancel remaining submissions (#3472 ) Add a set of logs which will be submitted to but not relied on for their SCTs, this allows us to test submissions to a particular log or submit to a log which is not yet approved by a browser/root program. Also add a feature which stops cancellations of remaining submissions when racing to get a SCT from a group of logs. Additionally add an informational log that always times out in config-next. Fixes #3464 and fixes #3465.	2018-02-23 21:51:50 -05:00
Jacob Hoffman-Andrews	92c9340fe8	Deprecate CountCertificatesExact. (#3462 ) This is now enabled in prod and we can make it the default.	2018-02-20 14:34:03 -08:00
Roland Bracewell Shoemaker	8446571b46	Remove EnforceChallengeDisable (#3444 ) Removes usage of the `EnforceChallengeDisable` feature, the feature itself is not removed as it is still configured in staging/production, once that is fixed I'll submit another PR removing the actual flag. This keeps the behavior that when authorizations are retrieved from the SA they have their challenges populated, because that seems to make the most sense to me? It also retains TLS re-validation. Fixes #3441.	2018-02-14 13:21:26 -08:00
Jacob Hoffman-Andrews	c556a1a20d	Reduce spurious errors in integration test (#3436 ) Boulder is fairly noisy about gRPC connection errors. This is a mixed blessing: Our gRPC configuration will try to reconnect until it hits an RPC deadline, and most likely eventually succeed. In that case, we don't consider those to really be errors. However, in cases where a connection is repeatedly failing, we'd like to see errors in the logs about connection failure, rather than "deadline exceeded." So we want to keep logging of gRPC errors. However, right now we get a lot of these errors logged during integration tests. They make the output hard to read, and may disguise more serious errors. So we'd like to avoid causing such errors in normal integration test operation. This change reorders the startup of Boulder components by their gRPC dependencies, so everything's backend is likely to be up and running before it starts. It also reverses that order for clean shutdowns, and waits for each process to exit before signalling the next one. With these changes, I still got connection errors. Taking listenbuddy out of the gRPC path fixed them. I believe the issue is that listenbuddy is not a truly transparent proxy. In particular, it accepts an inbound TCP connection before opening an outbound TCP connection. If opening that outbound connection results in "connection refused," it closes the inbound connection. That means gRPC sees a "connection closed" (or "connection reset"?) rather than "connection refused". I'm guessing it handles those cases differently, explaining the different error results. We've been using listenbuddy to trigger disconnects while Boulder is running, to ensure that gRPC's reconnect code works. I think we can probably rely on gRPC's reconnect to work. The initial problem that led us to start testing this was a configuration problem; now that we have the configuration we want, we should be fine and don't need to keep testing reconnects on every integration test run.	2018-02-12 18:17:50 -08:00
Jacob Hoffman-Andrews	2dc3b56fa9	Add variable latency to ct-test-srv (#3435 ) For the upcoming SCT embedding changes, it will be useful to have a CT test server that blocks for nontrivial amounts of time before responding. This change introduces a config file for `ct-test-srv` that can be used to set up multiple "personalities" on various ports. Each personality can have a "latencySchedule" that determines how long it will sleep before responding to a submission. This change also introduces two new "personalities" on :4510 and :4511, plus configures CTLogGroups in the RA. Having four CT log personalities allows us to simulate two nontrivial log groups. Note: This triggers Publisher to emit audit errors on timed-out submissions. We may want to make Publisher not treat those as errors, and instead only log an error if a whole log group fails.	2018-02-09 13:48:19 -08:00
Roland Bracewell Shoemaker	fc5c8f76b6	Remove unused features (#3393 ) This removes a number of unused features (i.e. they are never checked anywhere).	2018-01-25 08:55:05 -05:00
Daniel McCarney	c6d56b7a84	Match RA `authorizationLifetimeDays` to prod. (#3370 )	2018-01-16 10:39:57 -08:00
Jacob Hoffman-Andrews	8153b919be	Implement TLSSNIRevalidation (#3361 ) This change adds a feature flag, TLSSNIRevalidation. When it is enabled, Boulder will create new authorization objects with TLS-SNI challenges if the requesting account has issued a certificate with the relevant domain name, and was the most recent account to do so. This setting overrides the configured list of challenges in the PolicyAuthority, so even if TLS-SNI is disabled in general, it will be enabled for revalidation. Note that this interacts with EnforceChallengeDisable. Because EnforceChallengeDisable causes additional checks at validation time and at issuance time, we need to update those two places as well. We'll send a follow-up PR with that. We chose to make this work only for the most recent account to issue, even if there were overlapping certificates, because it significantly simplifies the database access patterns and should work for 95+% of cases. Note that this change will let an account revalidate and reissue for a domain even if the previous issuance on that account used http-01 or dns-01. This also simplifies implementation, and fits within the intent of the mitigation plan: If someone previously issued for a domain using http-01, we have high confidence that they are actually the owner, and they are not going to "steal" the domain from themselves using tls-sni-01. Also note: This change also doesn't work properly with ReusePendingAuthz: true. Specifically, if you attempted issuance in the last couple days and failed because there was no tls-sni challenge, you'll still have an http-01 challenge lying around, and we'll reuse that; then your client will fail due to lack of tls-sni challenge again. This change was joint work between @rolandshoemaker and @jsha.	2018-01-12 11:00:06 -08:00
Maciej Dębski	44984cd84a	Implement regID whitelist for allowed challenge types. (#3352 ) This updates the PA component to allow authorization challenge types that are globally disabled if the account ID owning the authorization is on a configured whitelist for that challenge type.	2018-01-10 13:44:53 -05:00
Roland Shoemaker	1a3a76438c	Fix tests and GetOrderAuthorizations	2018-01-09 20:38:52 -08:00
Jacob Hoffman-Andrews	cd49316493	Do ROCAChecks by default. (#3283 ) This feature flag has been enabled in prod, and we don't expect to want to turn it off any time soon.	2017-12-15 13:44:39 -08:00
Jacob Hoffman-Andrews	f16c3af335	Active UDPDNS by default. (#3285 ) Active UDPDNS by default	2017-12-15 12:26:45 -08:00
Daniel McCarney	1c99f91733	Policy based issuance for wildcard identifiers (Round two) (#3252 ) This PR implements issuance for wildcard names in the V2 order flow. By policy, pending authorizations for wildcard names only receive a DNS-01 challenge for the base domain. We do not re-use authorizations for the base domain that do not come from a previous wildcard issuance (e.g. a normal authorization for example.com turned valid by way of a DNS-01 challenge will not be reused for a *.example.com order). The wildcard prefix is stripped off of the authorization identifier value in two places: When presenting the authorization to the user - ACME forbids having a wildcard character in an authorization identifier. When performing validation - We validate the base domain name without the *. prefix. This PR is largely a rewrite/extension of #3231. Instead of using a pseudo-challenge-type (DNS-01-Wildcard) to indicate an authorization & identifier correspond to the base name of a wildcard order name we instead allow the identifier to take the wildcard order name with the *. prefix.	2017-12-04 12:18:10 -08:00
Daniel McCarney	55dd1020c0	Increase VA SingleDialTimeout to 10s. (#3260 ) This PR changes the VA's singleDialTimeout value from 5 * time.Second to 10 * time.Second. This will give slower servers a better chance to respond, especially for the multi-VA case where n requests arrive ~simultaneously. This PR also bumps the RA->VA timeout by 5s and the WFE->RA timeout by 5s to accommodate the increased dial timeout. I put this in a separate commit in case we'd rather deal with this separately.	2017-12-04 09:53:26 -08:00
Roland Bracewell Shoemaker	d5db80ab12	Various publisher CT fixes (#3219 ) Makes a couple of changes: * Change `SubmitToCT` to make submissions to each log in parallel instead of in serial, this prevents a single slow log from eating up the majority of the deadline and causing submissions to other logs to fail * Remove the 'submissionTimeout' field on the publisher since it is actually bounded by the gRPC timeout and is misleading * Add a timeout to the CT client's internal HTTP client so that when log servers hang indefinitely we actually do retries instead of just using the entire submission deadline. Currently set at 2.5 minutes Fixes #3218.	2017-11-09 10:05:26 -05:00
Jacob Hoffman-Andrews	5df083a57e	Add ROCA weak key checking (#3189 ) Thanks to @titanous for the library!	2017-11-02 08:42:59 -04:00
Jacob Hoffman-Andrews	4e68fb2ff6	Switch to udp for internal DNS. (#3135 ) We used to use TCP because we would request DNSSEC records from Unbound, and they would always cause truncated records when present. Now that we no longer request those (#2718), we can use UDP. This is better because the TCP serving paths in Unbound are likely less thoroughly tested, and not optimized for high load. In particular this may resolve some availability problems we've seen recently when trying to upgrade to a more recent Unbound. Note that this only affects the Boulder->Unbound path. The Unbound->upstream path is already UDP by default (with TCP fallback for truncated ANSWERs).	2017-10-10 10:06:33 -04:00
Jacob Hoffman-Andrews	b0c7bc1bee	Recheck CAA for authorizations older than 8 hours (#3014 ) Fixes #2889. VA now implements two gRPC services: VA and CAA. These both run on the same port, but this allows implementation of the IsCAAValid RPC to skip using the gRPC wrappers, and makes it easier to potentially separate the service into its own package in the future. RA.NewCertificate now checks the expiration times of authorizations, and will call out to VA to recheck CAA for those authorizations that were not validated recently enough.	2017-08-28 16:40:57 -07:00
Roland Bracewell Shoemaker	90ba766af9	Add NewOrder RPCs + methods to SA and RA (#2907 ) Fixes #2875, #2900 and #2901.	2017-08-11 14:24:25 -04:00
Kleber Correia	338c61171b	Remove IDNASupport flag (#2926 ) Splitting #2712 into multiple per-flag PRs	2017-08-01 16:51:19 -07:00
Jacob Hoffman-Andrews	3431acfb92	Adjust testing maxNames config to match prod. (#2911 )	2017-07-27 15:23:29 -07:00
Jacob Hoffman-Andrews	8bc1db742c	Improve recycling of pending authzs (#2896 ) The existing ReusePendingAuthz implementation had some bugs: It would recycle deactivated authorizations, which then couldn't be fulfilled. (#2840) Since it was implemented in the SA, it wouldn't get called until after the RA checks the Pending Authorizations rate limit. Which means it wouldn't fulfill its intended purpose of making accounts less likely to get stuck in a Pending Authorizations limited state. (#2831) This factors out the reuse functionality, which used to be inside an "if" statement in the SA. Now the SA has an explicit GetPendingAuthorization RPC, which gets called from the RA before calling NewPendingAuthorization. This happens to obsolete #2807, by putting the recycling logic for both valid and pending authorizations in the RA.	2017-07-26 14:00:30 -07:00
Roland Bracewell Shoemaker	8ce2f8b432	Basic RSA known weak key checking (#2765 ) Adds a basic truncated modulus hash check for RSA keys that can be used to check keys against the Debian `{openssl,openssh,openvpn}-blacklist` lists of weak keys generated during the [Debian weak key incident](https://wiki.debian.org/SSLkeys). Testing is gated on adding a new configuration key to the WFE, RA, and CA configs which contains the path to a directory which should contain the weak key lists. Fixes #157.	2017-05-25 09:33:58 -07:00
Jacob Hoffman-Andrews	b17b5c72a6	Remove statsd from Boulder (#2752 ) This removes the config and code to output to statsd. - Change `cmd.StatsAndLogging` to output a `Scope`, not a `Statter`. - Remove the prefixing of component name (e.g. "VA") in front of stats; this was stripped by `autoProm` but now no longer needs to be. - Delete vendored statsd client. - Delete `MockStatter` (generated by gomock) and `mocks.Statter` (hand generated) in favor of mocking `metrics.Scope`, which is the interface we now use everywhere. - Remove a few unused methods on `metrics.Scope`, and update its generated mock. - Refactor `autoProm` and add `autoRegisterer`, which can be included in a `metrics.Scope`, avoiding global state. `autoProm` now registers everything with the `prometheus.Registerer` it is given. - Change va_test.go's `setup()` to not return a stats object; instead the individual tests that care about stats override `va.stats` directly. Fixes #2639, #2733.	2017-05-15 10:19:54 -04:00
Daniel McCarney	1ed34a4a5d	Fixes cert count rate limit for exact PSL matches. (#2703 ) Prior to this PR if a domain was an exact match to a public suffix list entry the certificates per name rate limit was applied based on the count of certificates issued for that exact name and all of its subdomains. This PR introduces an exception such that exact public suffix matches correctly have the certificate per name rate limit applied based on only exact name matches. In order to accomplish this a new RPC is added to the SA `CountCertificatesByExactNames`. This operates similar to the existing `CountCertificatesByNames` but does not include subdomains in the count, only exact matches to the names provided. The usage of this new RPC is feature flag gated behind the "CountCertificatesExact" feature flag. The RA unit tests are updated to test the new code paths both with and without the feature flag enabled. Resolves #2681	2017-05-02 13:43:35 -07:00
Roland Bracewell Shoemaker	a46d30945c	Purge remaining AMQP code (#2648 ) Deletes github.com/streadway/amqp and the various RabbitMQ setup tools etc. Changes how listenbuddy is used to proxy all of the gRPC client -> server connections so we test reconnection logic. +49 -8,221 😁 Fixes #2640 and #2562.	2017-04-04 15:02:22 -07:00
Jacob Hoffman-Andrews	6719dc17a6	Remove AMQP config and code (#2634 ) We now use gRPC everywhere.	2017-04-03 10:39:39 -04:00
David Calavera	c71c3cff80	Implement TLS-SNI-02 challenge validations. (#2585 ) I think these are all the necessary changes to implement TLS-SNI-02 validations, according to the section 7.3 of draft 05: https://tools.ietf.org/html/draft-ietf-acme-acme-05#section-7.3 I don't have much experience with this code, I'll really appreciate your feedback. Signed-off-by: David Calavera <david.calavera@gmail.com>	2017-03-22 10:17:59 -07:00
Jacob Hoffman-Andrews	cbde78d58f	Harmonize and tweak configs (#2479 ) Set authorizationLifetimeDays to 60 across both config and config-next. Set NumSessions to 2 in both config and config-next. A decrease from 10 because pkcs11-proxy (or pkcs11-daemon?) seems to error out under load if you have more sessions than CPUs. Reorder parallelGenerateOCSPRequests to match config-next. Remove extra tags for parsing yaml in config objects.	2017-01-10 13:46:38 -08:00
Jacob Hoffman-Andrews	510e279208	Simplify gRPC TLS configs. (#2470 ) Previously, a given binary would have three TLS config fields (CA cert, cert, key) for its gRPC server, plus each of its configured gRPC clients. In typical use, we expect all three of those to be the same across both servers and clients within a given binary. This change reuses the TLSConfig type already defined for use with AMQP, adds a Load() convenience function that turns it into a *tls.Config, and configures it for use with all of the binaries. This should make configuration easier and more robust, since it more closely matches usage. This change preserves temporary backwards-compatibility for the ocsp-updater->publisher RPCs, since those are the only instances of gRPC currently enabled in production.	2017-01-06 14:19:18 -08:00
Jacob Hoffman-Andrews	089a270453	Add instructions on load testing OCSP generation. (#2459 )	2017-01-02 11:36:03 -08:00
Jacob Hoffman-Andrews	0c665b2053	Split up gRPC certificates by service. (#2453 ) Previously, all gRPC services used the same client and server certificates. Now, each service has its own certificate, which it uses for both client and server authentication, more closely simulating production. This also adds aliases for each of the relevant hostnames in /etc/hosts. There may be some issues if Docker decides to rewrite /etc/hosts while Boulder is running, but this seems to work for now.	2016-12-29 14:53:59 -08:00
Jacob Hoffman-Andrews	1c1449b284	Improvements to tests and test configs. (#2396 ) - Remove spinner from test.js. It made Travis logs hard to read. - Listen on all interfaces for debugAddr. This makes it possible to check Prometheus metrics for instances running in a Docker container. - Standardize DNS timeouts on 1s and 3 retries across all configs. This ensures DNS completes within the relevant RPC timeouts. - Remove RA service queue from VA, since VA no longer uses the callback to RA on completing a challenge.	2016-12-05 14:35:27 -08:00
Roland Bracewell Shoemaker	03fdd65bfe	Add gRPC server to SA (#2374 ) Adds a gRPC server to the SA and SA gRPC Clients to the WFE, RA, CA, Publisher, OCSP updater, orphan finder, admin revoker, and expiration mailer. Also adds a CA gRPC client to the OCSP Updater which was missed in #2193. Fixes #2347.	2016-12-02 17:24:46 -08:00
Jacob Hoffman-Andrews	7c624d013e	Remove RA->{VA,CA} AMQP configs. (#2371 )	2016-11-30 08:55:03 -05:00
Roland Bracewell Shoemaker	a87379bc6e	Add gRPC server to RA (#2350 ) Fixes #2348.	2016-11-29 15:34:35 -08:00
Roland Bracewell Shoemaker	c5f99453a9	Switch CT submission RPC from CA -> RA (#2304 ) With the current gRPC design the CA talks directly to the Publisher when calling SubmitToCT which crosses security boundaries (secure internal segment -> internet facing segment) which is dangerous if (however unlikely) the Publisher is compromised and there is a gRPC exploit that allows memory corruption on the caller end of a RPC which could expose sensitive information or cause arbitrary issuance. Instead we move the RPC call to the RA which is in a less sensitive network segment. Switching the call site from the CA -> RA is gated on adding the gRPC PublisherService object to the RA config. Fixes #2202.	2016-11-08 11:39:02 -08:00
Daniel McCarney	6c983e8c9e	Implements client whitelisting for gRPC. (#2307 ) As described in #2282, our gRPC code uses mutual TLS to authenticate both clients and servers. However, currently our gRPC servers will accept any client certificate signed by the internal CA we use to authenticate connections. Instead, we would like each server to have a list of which clients it will accept. This will improve security by preventing the compromise of one client private key being used to access endpoints unrelated to its intended scope/purpose. This PR implements support for gRPC servers to specify a list of accepted client names. A `serverTransportCredentials` implementing `ServerHandshake` uses a `verifyClient` function to enforce that the connecting peer presents a client certificate with a SAN entry that matches an entry on the list of accepted client names. The `NewServer` function from `grpc/server.go` is updated to instantiate the `serverTransportCredentials` used by `grpc.NewServer`, specifying an accepted names list populated from the `cmd.GRPCServerConfig.ClientNames` config field. The pre-existing client and server certificates in `test/grpc-creds/` are replaced by versions that contain SAN entries as well as subject common names. A DNS and an IP SAN entry are added to allow testing both methods of specifying allowed SANs. The `generate.sh` script is converted to use @jsha's `minica` tool (OpenSSL CLI is blech!). An example client whitelist is added to each of the existing gRPC endpoints in config-next/ to allow the SAN of the test RPC client certificate. Resolves #2282	2016-11-08 13:57:34 -05:00
Daniel McCarney	eb67ad4f88	Allow `validateEmail` to timeout w/o error. (#2288 ) This PR reworks the validateEmail() function from the RA to allow timeouts during DNS validation of MX/A/AAAA records for an email to be non-fatal and match our intention to verify emails best-effort. Notes: bdns/problem.go - DNSError.Timeout() was changed to also include context cancellation and timeout as DNS timeouts. This matches what DNSError.Error() was doing to set the error message and supports external callers to Timeout not duplicating the work. bdns/mocks.go - the LookupMX mock was changed to support always.error and always.timeout in a manner similar to the LookupHost mock. Otherwise the TestValidateEmail unit test for the RA would fail when the MX lookup completed before the Host lookup because the error wouldn't be correct (empty DNS records vs a timeout or network error). test/config/ra.json, test/config-next/ra.json - the dnsTries and dnsTimeout values were updated such that dnsTries * dnsTimeout was <= the WFE->RA RPC timeout (currently 15s in the test configs). This allows the dns lookups to all timeout without the overall RPC timing out. Resolves #2260.	2016-10-27 11:56:12 -07:00
Roland Bracewell Shoemaker	ce679bad41	Implement key rollover (#2231 ) Fixes #503. Functionality is gated by the feature flag `AllowKeyRollover`. Since this functionality is only specified in ACME draft-03 and we mostly implement the draft-02 style this takes some liberties in the implementation, which are described in the updated divergences doc. The `key-change` resource is used to side-step draft-03 `url` requirement.	2016-10-27 10:22:09 -04:00
Roland Bracewell Shoemaker	5fabc90a16	Add IDN support (#2215 ) Add feature flagged support for issuing for IDNs, fixes #597. This patch expects that clients have performed valid IDN2008 encoding on any label that includes unicode characters. Invalid encodings (including non-compatible IDN2003 encoding) will be rejected. No script-mixing or script exclusion checks are performed as we assume that if a name is resolvable that it conforms to the registrar's policies on these matters and if it uses non-standard scripts in sub-domains etc that browsers should be the ones choosing how to display those names. Required a full update of the golang.org/x/net tree to pull in golang.org/x/net/idna, all test suites pass.	2016-10-06 13:05:37 -04:00

155 Commits