boulder

Commit Graph

Author	SHA1	Message	Date
Aaron Gable	78e4e82ffa	Feature cleanup (#7320 ) Remove three deprecated feature flags which have been removed from all production configs: - StoreLintingCertificateInsteadOfPrecertificate - LeaseCRLShards - AllowUnrecognizedFeatures Deprecate three flags which are set to true in all production configs: - CAAAfterValidation - AllowNoCommonName - SHA256SubjectKeyIdentifier IN-9879 tracked the removal of these flags.	2024-02-13 17:42:27 -08:00
Jacob Hoffman-Andrews	3865b46638	va: return error instead of ProblemDetails (#7313 ) This allows us to defer creating the user-friendly ProblemDetails to the highest level (va.PerformValidation), which in turn makes it possible to log the original error alongside the user-friendly error. It also reduces the likelihood of "boxed nil" bugs. Many of the unittests check for a specific ProblemDetails.Type and specific Details contents. These test against the output of `detailedError`, which transforms `error` into `ProblemDetails`. So the updates to the tests include insertion of `detailedError(err)` in many places. Several places that were returning a specific ProblemDetails.Type instead return the corresponding `berrors` type. This follows a pattern that `berrors` was designed to enable: use the `berrors` types internally and transform into `ProblemDetails` at the edge where we are rendering something to present to the user: WFE, and now VA.	2024-02-12 11:34:49 -08:00
Jacob Hoffman-Andrews	c2f7bfd645	va: remove arg from processRemoteCAAResults (#7314 ) In the current code, when `processRemoteCAAResults` is called, its `primaryResult` parameter is always set to `nil`. So we can simplify by removing that parameter.	2024-02-09 12:27:15 -08:00
Aaron Gable	3ac6996600	Recognize, but do not process, issuemail CAA tags (#7307 ) This allows Let's Encrypt applicants/subscribers to have critical issuemail CAA property tags without causing Let's Encrypt to bail out due to an unknown critical tag. The issuemail property tag was defined in RFC 9495 (https://www.rfc-editor.org/rfc/rfc9495.html). Fixes https://github.com/letsencrypt/boulder/issues/7301	2024-02-07 09:09:07 -08:00
Phil Porada	0e9f5d3545	va: Audit log which DNS resolver performs a lookup (#7271 ) Adds the chosen DNS resolver to the VA's `ValidationRecord` object so that for each challenge type during a validation, boulder can audit log the resolver(s) chosen to fulfill the request. Fixes https://github.com/letsencrypt/boulder/issues/7140	2024-02-05 14:26:39 -05:00
Phil Porada	03152aadc6	RVA: Recheck CAA records (#7221 ) Previously, `va.IsCAAValid` would only check CAA records from the primary VA during initial domain control validation, completely ignoring any configured RVAs. The upcoming [MPIC](https://github.com/ryancdickson/staging/pull/8) ballot will require that it be done from multiple perspectives. With the currently deployed [Multi-Perspective Validation](https://letsencrypt.org/2020/02/19/multi-perspective-validation.html) in staging and production, this change brings us in line with the [proposed phase 3](https://github.com/ryancdickson/staging/pull/8/files#r1368708684). This change reuses the existing [MaxRemoteValidationFailures](`21fc191273/cmd/boulder-va/main.go (L35)`) variable for the required non-corroboration quorum. > Phase 3: June 15, 2025 - December 14, 2025 ("CAs MUST implement MPIC in blocking mode"): > > MUST implement MPIC? Yes > Required quorum?: Minimally, 2 remote perspectives must be used. If using less than 6 remote perspectives, 1 non-corroboration is allowed. If using 6 or more remote perspectives, 2 non-corroborations are allowed. > MUST block issuance if quorum is not met: Yes. > Geographic diversity requirements?: Perspectives must be 500km from 1) the primary perspective and 2) all other perspectives used in the quorum. > > Note: "Blocking Mode" is a nickname. As opposed to "monitoring mode" (described in the last milestone), CAs MUST NOT issue a certificate if quorum requirements are not met from this point forward. Adds new VA feature flags: * `EnforceMultiCAA` instructs a primary VA to command each of its configured RVAs to perform a CAA recheck. * `MultiCAAFullResults` causes the primary VA to block waiting for all RVA CAA recheck results to arrive. Renamed `va.logRemoteValidationDifferentials` to `va.logRemoteDifferentials` because it can handle initial domain control validations and CAA rechecking with minimal editing. Part of https://github.com/letsencrypt/boulder/issues/7061	2024-01-25 16:23:25 -05:00
Aaron Gable	d57edfa0f1	Run more go vet checks (#7255 ) Enable the atomicalign, deepequalerrors, findcall, nilness, reflectvaluecompare, sortslice, timeformat, and unusedwrite go vet analyzers, which golangci-lint does not enable by default. Additionally, enable new go vet analyzers by default as they become available. The fieldalignment and shadow analyzers remain disabled because they report so many errors that they should be fixed in a separate PR. Note that the nilness analyzer appears to have found one very real bug in tlsalpn.go.	2024-01-17 12:27:55 -05:00
Viktor Szépe	5c0ca04575	Fix typos (#7241 ) Found new misspellings using the `typos` rust crate: https://crates.io/crates/typos	2024-01-09 13:17:27 -08:00
Aaron Gable	5e1bc3b501	Simplify the features package (#7204 ) Replace the current three-piece setup (enum of feature variables, map of feature vars to default values, and autogenerated bidirectional maps of feature variables to and from strings) with a much simpler one-piece setup: a single struct with one boolean-typed field per feature. This preserves the overall structure of the package -- a single global feature set protected by a mutex, and Set, Reset, and Enabled methods -- although the exact function signatures have all changed somewhat. The executable config format remains the same, so no deployment changes are necessary. This change does deprecate the AllowUnrecognizedFeatures feature, as we cannot tell the json config parser to ignore unknown field names, but that flag is set to False in all of our deployment environments already. Fixes https://github.com/letsencrypt/boulder/issues/6802 Fixes https://github.com/letsencrypt/boulder/issues/5229	2023-12-12 15:51:57 -05:00
Jacob Hoffman-Andrews	c21b376623	Implement DoH for validation queries (#7178 ) Fixes: #7141	2023-12-11 10:49:00 -08:00
Aaron Gable	6c92c3041a	CAA: Treat NXDOMAIN for a TLD as an error (#7104 ) Change the CAA NXDOMAIN carve-out to only apply to registered domains and their subdomains, not to TLDs. Our CAA lookup function has a carveout that allows queries which receive an NXDOMAIN response to be treated as though they received a successful empty response. This is important due to the confluence of three circumstances: 1) many clients use the DNS-01 method to validate issuance for names which generally don't have publicly-visible A/AAAA records; 2) many ACME clients remove their DNS-01 TXT record promptly after validation has completed; and 3) authorizations may be reused more than 8 hours after they were first validated and CAA was first checked. When these circumstances combine, the DNS rightly returns NXDOMAIN when we re-check CAA at issuance time, because no records exist at all for that name. We have to treat this as permission to issue, the same as any other domain which has no CAA records. However, this should never be the case for TLDs: those should always have at least one record. If a TLD returns NXDOMAIN, something much worse has happened -- such as a gTLD being unlisted by ICANN -- and we should treat it as a failure. This change adds a check that the name in question contains at least one dot (".") before triggering the existing carve-out, to make it so that the carve-out does not apply to TLDs. Fixes https://github.com/letsencrypt/boulder/issues/7056	2023-10-02 10:04:39 -07:00
Aaron Gable	3b880e1ccf	Add CAAAfterValidation feature flag (#7082 ) Add a new feature flag "CAAAfterValidation" which, when set to true in the VA, causes the VA to only begin CAA checks after basic domain control validation has completed successfully. This will make successful validations take longer, since the DCV and CAA checks are performed serially instead of in parallel. However, it will also reduce the number of CAA checks we perform by up to 80%, since such a high percentage of validations also fail. IN-9575 tracks enabling this feature flag in staging and prod Fixes https://github.com/letsencrypt/boulder/issues/7058	2023-09-18 13:30:31 -07:00
Aaron Gable	e09c5faf5e	Deprecate CAA AccountURI and ValidationMethods feature flags (#7000 ) These flags are set to true in all environments.	2023-07-14 14:54:39 -04:00
Aaron Gable	cc596bd4eb	Begin testing on go1.21rc2 with loopvar experiment (#6952 ) Add go1.21rc2 to the matrix of go versions we test against. Add a new step to our CI workflows (boulder-ci, try-release, and release) which sets the "GOEXPERIMENT=loopvar" environment variable if we're running go1.21. This experiment makes it so that loop variables are scoped only to their single loop iteration, rather than to the whole loop. This prevents bugs such as our CAA Rechecking incident (https://bugzilla.mozilla.org/show_bug.cgi?id=1619047). Also add a line to our docker setup to propagate this environment variable into the container, where it can affect builds. Finally, fix one TLS-ALPN-01 test to have the fake subscriber server actually willing to negotiate the acme-tls/1 protocol, so that the ACME server's tls client actually waits to (fail to) get the certificate, instead of dying immediately. This fix is related to the upgrade to go1.21, not the loopvar experiment. Fixes https://github.com/letsencrypt/boulder/issues/6950	2023-06-26 16:35:29 -07:00
Jacob Hoffman-Andrews	8dcbc4c92f	Add must.Do utility function (#6955 ) This can take two values (typically the return values of a two-value function) and panic if the error is non-nil, returning the interesting value. This is particularly useful for cases where we statically know the call will succeed. Thanks to @mcpherrinm for the idea!	2023-06-26 14:43:30 -07:00
Aaron Gable	620699216f	Remove the TLS-ALPN-01 tlsDial helper (#6954 ) This minor cleanup was found in the process of fixing tests in https://github.com/letsencrypt/boulder/pull/6952, and resolves a TODO from 2018.	2023-06-26 10:56:52 -07:00
Aaron Gable	2c9925797b	CAA: Don't fail on critical iodef property tags (#6921 ) RFC 8659 (CAA; https://www.rfc-editor.org/rfc/rfc8659) says that "A CA MUST NOT issue certificates for any FQDN if the Relevant RRset for that FQDN contains a CAA critical Property for an unknown or unsupported Property Tag." Let's Encrypt does technically support the iodef property tag: we recognize it, but then ignore it and never choose to send notifications to the given contact address. Historically, we have carried around the iodef property tags in our internal structures as though we might use them, but all code referencing them was essentially dead code. As part of a set of simplifications, https://github.com/letsencrypt/boulder/pull/6886 made it so that we completely ignore iodef property tags. However, this had the unintended side-effect of causing iodef property tags with the Critical bit set to be counted as "unknown critical" tags, which prevent issuance. This change causes our property tag parsing code to recognize iodef tags again, so that critical iodef tags don't prevent issuance.	2023-05-30 11:33:18 -07:00
Phil Porada	c75bf7033a	SA: Don't store HTTP-01 hostname and port in database validationrecord (#6863 ) Removes the `Hostname` and `Port` fields from an http-01 ValidationRecord model prior to storing the record in the database. Using `"hostname":"example.com","port":"80"` as a snippet of a whole validation record, we'll save minimum 36 bytes for each new http-01 ValidationRecord that gets stored. When retrieving the record, the ValidationRecord `RehydrateHostPort` method will repopulate the `Hostname` and `Port` fields from the `URL` field. Fixes the main goal of https://github.com/letsencrypt/boulder/issues/5231. --------- Co-authored-by: Samantha <hello@entropy.cat>	2023-05-23 15:36:17 -04:00
Aaron Gable	3990a08328	Add relevant domain to CAA errors and logs (#6886 ) When processing CAA records, keep track of the FQDN at which that CAA record was found (which may be different from the FQDN for which we are attempting issuance, since we crawl CAA records upwards from the requested name to the TLD). Then surface this name upwards so that it can be included in our own log lines and in the problem documents which we return to clients. Fixes https://github.com/letsencrypt/boulder/issues/3171	2023-05-22 15:08:56 -04:00
alexzorin	0a65e87c1b	va: make http keyAuthz mismatch problem wording less ambiguous (#6903 ) Occasionally (and just now) I've responded to an issue or thread that involves this error message: > The key authorization file from the server did not match this challenge "LoqXcYV8q5ONbJQxbmR7SCTNo3tiAXDfowyjxAjEuX0.9jg46WB3rR_AHD-EBXdN7cBkH1WOu0tA3M9fm21mqTI" != "\xef\xffAABBCC and I've found myself looking at Boulder's source code, to check which way around the values are. I suspect that users are not understanding it either.	2023-05-18 12:04:14 -04:00
Aaron Gable	1fcd951622	Probs: simplifications and cleanup (#6876 ) Make minor, non-user-visible changes to how we structure the probs package. Notably: - Add new problem types for UnsupportedContact and UnsupportedIdentifier, which are specified by RFC8555 and which we will use in the future, but haven't been using historically. - Sort the problem types and constructor functions to match the (alphabetical) order given in RFC8555. - Rename some of the constructor functions to better match their underlying problem types (e.g. "TLSError" to just "TLS"). - Replace the redundant ProblemDetailsToStatusCode function with simply always returning a 500 if we haven't properly set the problem's HTTPStatus. - Remove the ability to use either the V1 or V2 error namespace prefix; always use the proper RFC namespace prefix.	2023-05-12 12:10:13 -04:00
Aaron Gable	9262ca6e3f	Add grpc implementation tests to all services (#6782 ) As a follow-up to #6780, add the same style of implementation test to all of our other gRPC services. This was not included in that PR just to keep it small and single-purpose.	2023-03-31 09:52:26 -07:00
Matthew McPherrin	e1ed1a2ac2	Remove beeline tracing (#6733 ) Remove tracing using Beeline from Boulder. The only remnant left behind is the deprecated configuration, to ensure deployability. We had previously planned to swap in OpenTelemetry in a single PR, but that adds significant churn in a single change, so we're doing this as multiple steps that will each be significantly easier to reason about and review. Part of #6361	2023-03-14 15:14:27 -07:00
Matthew McPherrin	9e2a8d3882	Time only a single iteration of va.fetchHTTP to increase test reliability (#6719 ) Compute "started" and "took" variables inside the retry loop, so that we're actually measuring how long the request which timed out took, rather than measuring how long the request which timed out plus all the previous "network unreachable" attempts took.	2023-03-08 11:57:36 -05:00
Phil Porada	9390c0e5f5	Put errors at end of log lines (#6627 ) For consistency, put the error field at the end of unstructured log lines to make them more ... structured. Adds the `issuerID` field to "orphaning certificate" log line in the CA to match the "orphaning precertificate" log line. Fixes broken tests as a result of the CA and bdns log line change. Fixes #5457	2023-02-03 11:28:38 -05:00
Phil Porada	9d9a2dddcf	Rework VA PortConfig (#6619 ) Remove the PortConfig field from both the VA's config struct and from the NewValidationAuthorityImpl constructor. This config item is no longer used anywhere, and removing this prevents us from accidentally overriding the "Authorized Ports" (80 and 443) which are required by the Baseline Requirements. Unit tests are still able to override the httpPort and tlsPort fields of the ValidationAuthorityImpl. Fixes #3940	2023-01-30 17:03:33 -08:00
Phil Porada	26e5b24585	dependencies: Replace square/go-jose.v2 with go-jose/go-jose.v2 (#6598 ) Fixes #6573	2023-01-24 12:08:30 -05:00
Jacob Hoffman-Andrews	60683e36ee	va: filter invalid UTF-8 in IsCAAValid (#6525 ) Followup from #6506. As noted in that PR: I'm not aware of any way to trigger invalid UTF-8 from the layers below this, so I can't think of a good way to unittest it.	2022-11-21 15:47:48 -08:00
Jacob Hoffman-Andrews	46323d25be	va: filter invalid UTF-8 from ProblemDetails (#6506 ) This avoids serialization errors passing through gRPC. Also, add a pass-through path in replaceInvalidUTF8 that saves an allocation in the trivial case. Fixes #6490	2022-11-21 11:05:21 -08:00
Aaron Gable	7517b0d80f	Rehydrate CAA account and method binding (#6501 ) Make minor changes to our implementation of CAA Account and Method Binding, as a result of reviewing the code in preparation for enabling it in production. Specifically: - Ensure that the validation method and account ID are included at the request level, rather than waiting until we perform the checks which use those parameters; - Clean up code which assumed the validation method and account ID might not be populated; - Use the core.AcmeChallenge type (rather than plain string) for the validation method everywhere; - Update comments to reference the latest version and correct sections of the CAA RFCs; and - Remove the CAA feature flags from the config integration tests to reflect that they are not yet enabled in prod. I have reviewed this code side-by-side with RFC 8659 (CAA) and RFC 8657 (ACME CAA Account and Method Binding) and believe it to be compliant with both.	2022-11-17 13:31:04 -08:00
Aaron Gable	89f7fb1636	Clean up go1.19 TODOs (#6464 ) Clean up several spots where we were behaving differently on go1.18 and go1.19, now that we're using go1.19 everywhere. Also re-enable the lint and generate tests, and fix the various places where the two versions disagreed on how comments should be formatted. Also clean up the OldTLS codepaths, now that both go1.19 and our own feature flags have forbidden TLS < 1.2 everywhere. Fixes #6011	2022-10-21 15:54:18 -07:00
Samantha	bdd9ad9941	grpc: Pass data necessary for Retry-After headers in BoulderErrors (#6415 ) - Add a new field, `RetryAfter` to `BoulderError`s - Add logic to wrap/unwrap the value of the `RetryAfter` field to our gRPC error interceptor - Plumb `RetryAfter` for `DuplicateCertificateError` emitted by RA to the WFE client response header Part of #6256	2022-10-03 16:24:58 -07:00
Samantha	90eb90bdbe	test: Replace sd-test-srv with consul (#6389 ) - Add a dedicated Consul container - Replace `sd-test-srv` with Consul - Add documentation for configuring Consul - Re-issue all gRPC credentials for `<service-name>.service.consul` Part of #6111	2022-09-19 16:13:53 -07:00
Aaron Gable	0340b574d9	Add unparam linter to CI (#6312 ) Enable the "unparam" linter, which checks for unused function parameters, unused function return values, and parameters and return values that always have the same value every time they are used. In addition, fix many instances where the unparam linter complains about our existing codebase. Remove error return values from a number of functions that never return an error, remove or use context and test parameters that were previously unused, and simplify a number of (mostly test-only) functions that always take the same value for their parameter. Most notably, remove the ability to customize the RSA Public Exponent from the ceremony tooling, since it should always be 65537 anyway. Fixes #6104	2022-08-23 12:37:24 -07:00
Aaron Gable	d1b211ec5a	Start testing on go1.19 (#6227 ) Run the Boulder unit and integration tests with go1.19. In addition, make a few small changes to allow both sets of tests to run side-by-side. Mark a few tests, including our lints and generate checks, as go1.18-only. Reformat a few doc comments, particularly lists, to abide by go1.19's stricter gofmt. Causes #6275	2022-08-10 15:30:43 -07:00
Aaron Gable	9c197e1f43	Use io and os instead of deprecated ioutil (#6286 ) The ioutil package has been deprecated since go1.16; the various functions it provided now exist in the os and io packages. Replace all instances of ioutil with either io or os, as appropriate.	2022-08-10 13:30:17 -07:00
Aaron Gable	8227c8fcb2	Fix flake in TestMultiVA (#6141 ) Fix a race condition in one of the TestMultiVA test cases. There are many other possible flakes in these tests, because they use real HTTP to talk to a fake remote VA on localhost, but we can at least trivially remove this race condition. Fixes #6119	2022-05-25 14:25:41 -07:00
Aaron Gable	9b4ca235dd	Update boulder-tools dependencies (#6129 ) Update: - golangci-lint from v1.42.1 to v1.46.2 - protoc from v3.15.6 to v3.20.1 - protoc-gen-go from v1.26.0 to v1.28.0 - protoc-gen-go-grpc from v1.1.0 to v1.2.0 - fpm from v1.14.0 to v1.14.2 Also remove a reference to go1.17.9 from one last place. This does result in updating all of our generated .pb.go files, but only to update the version number embedded in each file's header. Fixes #6123	2022-05-20 14:24:01 -07:00
Aaron Gable	3a94adecb1	Fix nil pointer dereference in TestMultiVA (#6132 ) If the test unexpectedly failed, it would hit a nil pointer dereference on line 511 when it tries to access the `.Type` field of nil. Add another case to handle this.	2022-05-19 17:16:06 -07:00
Aaron Gable	8cb01a0c34	Enable additional linters (#6106 ) These new linters are almost all part of golangci-lint's collection of default linters, that would all be running if we weren't setting `disable-all: true`. By adding them, we now have parity with the default configuration, as well as the additional linters we like. Adds the following linters: * unconvert * deadcode * structcheck * typecheck * varcheck * wastedassign	2022-05-11 13:58:58 -07:00
Jacob Hoffman-Andrews	cf9df961ba	Add feature flags for upcoming deprecations (#6043 ) This adds three feature flags: SHA1CSRs, OldTLSOutbound, and OldTLSInbound. Each controls the behavior of an upcoming deprecation (except OldTLSInbound, which isn't yet scheduled for a deprecation but will be soon). Note that these feature flags take advantage of `features`' default values, so they can default to "true" (that is, each of these features is enabled by default), and we set them to "false" in the config JSON to turn them off when the time comes. The unittest for OldTLSOutbound requires that `example.com` resolves to 127.0.0.1. This is because there's logic in the VA that checks that redirected-to hosts end in an IANA TLD. The unittest relies on redirecting, and we can't use e.g. `localhost` in it because of that TLD check, so we use example.com. Fixes #6036 and #6037	2022-04-15 12:14:00 -07:00
Samantha	a9ba5e42a0	VA: Add IP address to detailed errors (#6039 ) Prepend the IP address of the remote host where HTTP-01 or TLS-ALPN-01 validation was attempted in the detailed error response body. Fixes #6016	2022-04-13 12:55:35 -07:00
Aaron Gable	ed912c3aa5	Remove duplication from TLS-ALPN-01 error messages (#6028 ) Slightly refactor `validateTLSALPN01` to use a common function to format the error messages it returns. This reduces code duplication and makes the important validation logic easier to follow. Fixes #5922	2022-04-04 09:17:16 -07:00
Jacob Hoffman-Andrews	07cb1179d0	Add logging of "oldTLS" bit (#6008 ) That causes the VA to emit ValidationRecords with the OldTLS bit set if it observes a redirect to HTTPS that negotiates TLS < 1.2. I've manually tested but there is not yet an integration test. I need to make a parallel change in challtestsrv and then incorporate here.	2022-03-21 11:34:03 -07:00
Aaron Gable	b19b79162f	Minor updates from review of the HTTP-01 method (#5975 ) Make minor updates to our implementation of the HTTP-01 validation method based on in-depth review of BRs Section 3.2.2.4.19 and RFC 8555 Section 8.3. - Move the HTTP response code check above parsing the body. - Explicitly check for 301, 302, 307, and 308 redirect codes, so that if the go stdlib updates to allow additional redirects we don't follow suit. - Trim additional forms of white-space from the key authorization.	2022-03-03 11:23:10 -08:00
Aaron Gable	c94a24897f	Remove go1.16 backwards compatibility hacks (#5952 ) These were needed for the transition from go1.16 to go1.17. We don't run go1.16 anywhere anymore, so they can be removed.	2022-02-22 14:23:28 -08:00
Aaron Gable	cfab636c5a	TLS-ALPN-01: Check that challenge cert is self-signed (#5928 ) RFC 8737 says "The client prepares for validation by constructing a self-signed certificate...". Add a check for whether the challenge certificate is self-signed by ensuring its issuer and subject are equal, and checking its signature with its own public key. Also slightly refactor the helper methods to return only a single cert, since we only care about the first one returned. And add a test.	2022-02-02 16:45:13 -08:00
Aaron Gable	305ef9cce9	Improve error checking paradigm (#5920 ) We have decided that we don't like the if err := call(); err != nil syntax, because it creates confusing scopes, but we have not cleaned up all existing instances of that syntax. However, we have now found a case where that syntax enables a bug: It caused readers to believe that a later err = call() statement was assigning to an already-declared err in the local scope, when in fact it was assigning to an already-declared err in the parent scope of a closure. This caused our ineffassign and staticcheck linters to be unable to analyze the lifetime of the err variable, and so they did not complain when we never checked the actual value of that error. This change standardizes on the two-line error checking syntax everywhere, so that we can more easily ensure that our linters are correctly analyzing all error assignments.	2022-02-01 14:42:43 -07:00
J.C. Jones	f511442e84	Don't allow multiple certificate extensions from TLS-ALPN-01 (#5919 ) Detect when a non-compliant ACME client presents a non-compliant X.509 certificate that contains multiple copies of the SubjectAlternativeName or ACME Identifier extensions; this should be forbidden.	2022-02-01 13:37:51 -08:00
Aaron Gable	25f5e40e77	Require that TLS-ALPN-01 cert have only dnsNames (#5916 ) RFC 8737 states that the certificate presented during the "acme-tls/1" handshake has "a subjectAltName extension containing the dNSName being validated and no other entries". We were checking that it contained no other dNSNames, but not requiring that it not have any other kinds of Subject Alternative Names. Factor all of our SAN checks into a helper function. Have that function construct the expected bytes of the SAN extension from the one DNS name we expect to see, and assert that the actual bytes match the expectation. Add non-DNS-name identifiers to our error output when we encounter a cert whose SANs don't match. And add tests which check that we fail the validation when the cert has multiple SANs.	2022-01-28 15:38:57 -08:00
Aaron Gable	4835709232	Remove support for obsolete id-pe-acmeIdentifier OID (#5906 ) Current metrics show that subscribers present certificates using the obsolete OID to identify their id-pe-acmeIdentifier extension about an order of magnitude less often than they present the correct OID. Remove support for the never-standardized OID.	2022-01-25 10:10:03 -08:00
Aaron Gable	8c28e49ab6	Enforce TLS1.2 when validating TLS-ALPN-01 (#5905 ) RFC 8737, Section 4, states "ACME servers that implement "acme-tls/1" MUST only negotiate TLS 1.2 [RFC5246] or higher when connecting to clients for validation." Enforce that our outgoing connections to validate TLS-ALPN-01 challenges do not negotiate TLS1.1.	2022-01-25 09:57:34 -08:00
Aaron Gable	ab79f96d7b	Fixup staticcheck and stylecheck, and violations thereof (#5897 ) Add `stylecheck` to our list of lints, since it got separated out from `staticcheck`. Fix the way we configure both to be clearer and not rely on regexes. Additionally fix a number of easy-to-change `staticcheck` and `stylecheck` violations, allowing us to reduce our number of ignored checks. Part of #5681	2022-01-20 16:22:30 -08:00
Aaron Gable	2f2bac4bf2	Improve readability of A and AAAA lookup errors (#5843 ) When we query DNS for a host, and both the A and AAAA lookups fail or are empty, combine both errors into a single error rather than only returning the error from the A lookup. Fixes #5819 Fixes #5319	2022-01-03 10:39:25 -08:00
Jacob Hoffman-Andrews	4205400a98	Lower logDNSError to info level. (#5701 ) These log lines are sometimes useful for debugging, but are a normal part of operation, not an error: Unbound will allow a response to timeout if the remote server is too slow.	2021-10-12 10:44:54 -06:00
Samantha	6eee230d69	BDNS: Ensure DNS server addresses are dialable (#5520 ) - Add function `validateServerAddress()` to `bdns/servers.go` which ensures that DNS server addresses are TCP/UDP dial-able per: https://golang.org/src/net/dial.go?#L281 - Add unit test for `validateServerAddress()` in `bdns/servers_test.go` - Update `cmd/boulder-va/main.go` to handle `bdns.NewStaticProvider()` potentially returning an error. - Update unit tests in `bdns/dns_test.go`: - Handle `bdns.NewStaticProvider()` potentially returning an error - Add an IPv6 address to `TestRotateServerOnErr` - Ensure DNS server addresses are validated by `validateServerAddress` whenever: - `dynamicProvider.update()` is called - `staticProvider` is constructed - Construct server addresses using `net.JoinHostPort()` when `dynamicProvider.Addrs()` is called Fixes #5463	2021-07-20 10:11:11 -07:00
Aaron Gable	4c581436a3	Add go1.17beta1 to CI (#5483 ) Add go1.17beta1 docker images to the set of things we build, and integrate go1.17beta1 into the set of environments CI runs. Fix one test which breaks due to an underlying refactoring in the `crypto/x509` stdlib package. Fix one other test which breaks due to new guarantees in the stdlib's TLS ALPN implementation. Also removes go1.16.5 from CI so we're only running 2 versions. Fixes #5480	2021-07-13 10:00:04 -07:00
Aaron Gable	64c9ec350d	Unify protobuf generation (#5458 ) Create script which finds every .proto file in the repo and correctly invokes `protoc` for each. Create a single file with a `//go:generate` directive to invoke the new script. Delete all of the other generate.go files, so that our proto generation is unified in one place. Fixes #5453	2021-06-07 08:49:15 -07:00
Aaron Gable	9abb39d4d6	Honeycomb integration proof-of-concept (#5408 ) Add Honeycomb tracing to all Boulder components which act as HTTP servers, gRPC servers, or gRPC clients. Add many values which we currently emit to logs to the trace spans. Add a way to configure the Honeycomb integration to our config files, and by default configure all of our tests to "mute" (send nothing). Followup changes will refine the configuration, attempt to reduce the new dependency load, and introduce better sampling. Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218	2021-05-24 16:13:08 -07:00
Aaron Gable	a19ebfa0e9	VA: Query SRV to preload/cache DNS resolver addrs (#5360 ) Abstract out the way that the bdns library keeps track of the resolvers it uses to do DNS lookups. Create one implementation, the `StaticProvider`, which behaves exactly the same as the old mechanism (providing whatever names or addresses were given in the config). Create another implementation, `DynamicProvider`, which re-resolves the provided name on a regular basis. The dynamic provider consumes a single name, does a lookup on that name for any SRV records suggesting that it is running a DNS service, and then looks up A records to get the address of all the names returned by the SRV query. It exports its successes and failures as a prometheus metric. Finally, update the tests and config-next configs to work with this new mechanism. Give sd-test-srv the capability to respond to SRV queries, and put the names it provides into docker's default DNS resolver. Fixes #5306	2021-04-20 10:11:53 -07:00
Samantha	6cd59b75f2	VA: Don't follow 303 redirects (#5384 ) - VA should reject redirects with an HTTP status code of 303 - Add 303 redirect test Fixes #5358	2021-04-05 11:29:01 -07:00
Jacob Hoffman-Andrews	7194624191	Update grpc and protobuf to latest. (#5369 ) protoc now generates grpc code in a separate file from protobuf code. Also, grpc servers are now required to embed an "unimplemented" interface from the generated .pb.go file, which provides forward compatibility. Update the generate.go files since the invocation for protoc has changed with the split into .pb.go and _grpc.pb.go. Fixes #5368	2021-04-01 17:18:15 -07:00
Aaron Gable	ef1d3c4cde	Standardize on `AssertMetricWithLabelsEquals` (#5371 ) Update all of our tests to use `AssertMetricWithLabelsEquals` instead of combinations of the older `CountFoo` helpers with simple asserts. This coalesces all of our prometheus inspection logic into a single function, allowing the deletion of four separate helper functions.	2021-04-01 15:20:43 -07:00
Andrew Gabbitas	3d9d5e2306	Cleanup go1.15.7 (#5374 ) Remove code that is no longer needed after migrating to go1.16.x. Remove testing with go1.15.7 in the test matrix. Fixes #5321	2021-04-01 10:50:18 -07:00
Andrew Gabbitas	81eed0cd07	Replace invalid UTF-8 in error message (#5341 ) Add processing to http body when it is passed as an error to be properly marshalled for grpc. Fixes #5317	2021-03-16 14:10:16 -06:00
Aaron Gable	95b77dbd25	Remove va gRPC wrapper (#5328 ) Delete the ValidationAuthorityGRPCServer and ...GRPCClient structs, and update references to instead reference the underlying vapb.VAClient type directly. Also delete the core.ValidationAuthority interface. Does not require updating interfaces elsewhere, as the client wrapper already included the variadic grpc.CallOption parameter. Fixes #5325	2021-03-11 15:38:50 -08:00
Andrew Gabbitas	ceffe18dfc	Add testing for golang 1.16 (#5313 ) - Add 1.16.1 to the GitHub CI test matrix - Fix tlsalpn tests for go 1.16.1 but maintain compatibility with 1.15.x - Fix integration tests. Fix: #5301 Fix: #5316	2021-03-11 11:47:41 -08:00
Andrew Gabbitas	f5362fba24	Add Validated time field to challenges (#5288 ) Move the validated timestamp to the RA where the challenge is passed to the SA for database storage. If a challenge becomes valid or invalid, take the validated timestamp and store it in the attemptedAt field of the authz2 table. Upon retrieval of the challenge from the database, add the attemptedAt value to challenge.Validated which is passed back to the WFE and presented to the user as part of the challenge as required in ACME RFC8555. Fix: #5198	2021-03-10 14:39:59 -08:00
Jacob Hoffman-Andrews	2a8f0fe6ac	Rename several items in bdns (#5260 ) [Go style says](https://blog.golang.org/package-names): > Avoid stutter. Since client code uses the package name as a prefix > when referring to the package contents, the names for those contents > need not repeat the package name. The HTTP server provided by the > http package is called Server, not HTTPServer. Client code refers to > this type as http.Server, so there is no ambiguity. Rename DNSClient, DNSClientImpl, NewDNSClientImpl, NewTestDNSClientImpl, DNSError, and MockDNSClient to follow those guidelines. Unexport DNSClientImpl and MockTimeoutError (was only used internally). Make New and NewTest return the Client interface rather than a concrete `impl` type.	2021-01-29 17:20:35 -08:00
Jacob Hoffman-Andrews	2a6cb72518	Speed up VA test. (#5261 ) We had a test that relied on sleeping to hit a timeout. This doesn't remove the sleep, but it does tighten the duration significantly. Brings unit test time for the VA from 11 seconds to 1.7 seconds on my machine.	2021-01-29 17:07:58 -08:00
Andrew Gabbitas	aa20bcaded	Add validated timestamp to challenges (#5253 ) We do not present a validated timestamp in challenges where status = valid as required by RFC8555. This change is the first step to presenting challenge timestamps to the client. It adds a timestamp to each place where we change a challenge to valid. This only displays in the logs and will not display to the subscriber because it is not yet stored somewhere retrievable. The next step will be to store it in the database and then finally present it to the client. Part of #5198	2021-01-29 08:07:32 -08:00
Andrew Gabbitas	a0d12af73c	Detect redirect loops in VA (#5234 ) Currently the VA checks to see how many redirects have been followed and bails out if greater than maxRedirect (10), but it does not check to see if any redirect url has been followed twice which would mean a broken infinite redirect loop. Storing the validation records for these is relatively expensive because we store a record for each hop in the redirect. This change checks the previous redirect records to see if the URL has been used before and errors if it has. This will catch a redirect loop earlier than the maxRedirect value in most cases. Fixes #5224	2021-01-19 16:38:03 -08:00
Samantha	802d4fed9d	Return full CAA RR response from bdns to va (#5181 ) When the VA encounters CAA records, it logs the contents of those records. When those records were the result of following a chain of CNAMEs, the CNAMEs are included as part of the response from our recursive resolver. However, the current flow for logging the responses logs only the CAA records, not the CNAMEs. This change returns the complete dig-style RR response from bdns to the va where the response of the authoritative CAA RR is string-quoted and logged. This dig-style RR response is quite verbose, however it is only ever returned from bdns.LookupCAA when a CAA response is non-empty. If the CAA response is empty only an empty string is returned. Fixes #5082	2020-12-10 18:17:04 -08:00
Aaron Gable	294d1c31d7	Use error wrapping for berrors and tests (#5169 ) This change adds two new test assertion helpers, `AssertErrorIs` and `AssertErrorWraps`. The former is a wrapper around `errors.Is`, and asserts that the error's wrapping chain contains a specific (i.e. singleton) error. The latter is a wrapper around `errors.As`, and asserts that the error's wrapping chain contains any error which is of the given type; it also has the same unwrapping side effect as `errors.As`, which can be useful for further assertions about the contents of the error. It also makes two small changes to our `berrors` package, namely making `berrors.ErrorType` itself an error rather than just an int, and giving `berrors.BoulderError` an `Unwrap()` method which exposes that inner `ErrorType`. This allows us to use the two new helpers above to make assertions about berrors, rather than having to hand-roll equality assertions about their types. Finally, it takes advantage of the two changes above to greatly simplify many of the assertions in our tests, removing conditional checks and replacing them with simple assertions.	2020-11-06 13:17:11 -08:00
Samantha	387e94407c	va: replacing error assertions with errors.As (#5136 ) errors.As checks for a specific error in a wrapped error chain (see https://golang.org/pkg/errors/#As) as opposed to asserting that an error is of a specific type. Part of #5010	2020-10-30 15:51:29 -07:00
Jacob Hoffman-Andrews	bf7c80792d	core: move to proto3 (#5063 ) Builds on #5062 Part of #5050	2020-08-31 17:58:32 -07:00
Aaron Gable	8556d8a801	Update VA RPCs to proto3 (#5005 ) This updates va.proto to use proto3 syntax, and updates all clients of the autogenerated code to use the new types. In particular, it removes indirection from built-in types (proto3 uses ints, rather than pointers to ints, for example). Depends on #5003 Fixes #4956	2020-08-17 15:20:51 -07:00
Aaron Gable	e2c8f6743a	Introduce new core.AcmeChallenge type (#5012 ) ACME Challenges are well-known strings ("http-01", "dns-01", and "tlsalpn-01") identifying which kind of challenge should be used to verify control of a domain. Because they are well-known and only certain values are valid, it is better to represent them as something more akin to an enum than as bare strings. This also improves our ability to ensure that an AcmeChallenge is not accidentally used as some other kind of string in a different context. This change also brings them closer in line with the existing core.AcmeResource and core.OCSPStatus string enums. Fixes #5009	2020-08-11 15:02:16 -07:00
Aaron Gable	8920b698ea	Report canceled remote validations as problems (#5011 ) Previously, canceled remote validations were simply noted and then dropped on the floor. This should be safe, as they're theoretically only canceled when the parent span (i.e. the local PerformValidation RPC) ends. But for the sake of defense-in-depth, it seems better to correctly mark canceled remote validations as having Problems, so that their results cannot be accidentally used anywhere. This results in a test behavior change: if EnforceMultiVA is on, and some RPCs are canceled, this now results in validation failure. This should not have any production impact, because remote validations should only be canceled when the parent RPC early-exits, but that only happens when EnforceMultiVA is not enabled. These tests now test a case where the other remote validations were canceled for some other reason, which should result in validation failure.	2020-08-11 09:29:49 -07:00
Aaron Gable	0f5d2064a8	Remove logic from VA PerformValidation wrapper (#5003 ) Updates the type of the ValidationAuthority's PerformValidation method to be identical to that of the corresponding auto-generated grpc method, i.e. directly taking and returning proto message types, rather than exploded arguments. This allows all logic to be removed from the VA wrappers, which will allow them to be fully removed after the migration to proto3. Also updates all tests and VA clients to adopt the new interface. Depends on #4983 (do not review first four commits) Part of #4956	2020-08-06 10:45:35 -07:00
Aaron Gable	634d57ce86	Use 2-space indents in all proto files (#5006 ) Our proto files had a variety of indentation styles: 2 spaces, 4 spaces, 8 spaces, and tabs; sometimes mixed within the same file. The proto3 style guide[1] says to use 2-space indents, so this change standardizes on that. [1] https://developers.google.com/protocol-buffers/docs/style	2020-08-05 10:38:19 -07:00
Roland Bracewell Shoemaker	75b034637b	Update travis go versions (remove 1.14.1, add 1.15rc1) (#5002 ) Fixes #4919.	2020-08-04 12:13:09 -07:00
Aaron Gable	7e626b63a6	Temporarily revert CA and VA proto3 migrations (#4962 )	2020-07-16 14:29:42 -07:00
Aaron Gable	281575433b	Switch VA RPCs to proto3 (#4960 ) This updates va.proto to use proto3 syntax, and updates all clients of the autogenerated code to use the new types. In particular, it removes indirection from built-in types (proto3 uses ints, rather than pointers to ints, for example). Fixes #4956	2020-07-16 09:16:23 -07:00
orangepizza	dee757c057	Remove multiva exception list code (#4933 ) Fixes #4931	2020-07-08 10:57:17 -07:00
Roland Bracewell Shoemaker	325bba3a6f	va: measure local validation latency separately (#4865 )	2020-06-12 12:44:25 -07:00
Jacob Hoffman-Andrews	b1347fb3b3	Upgrade to latest protoc and protoc-gen-go (#4794 ) There are some changes to the code generated in the latest version, so this modifies every .pb.go file. Also, the way protoc-gen-go decides where to put files has changed, so each generate.go gets the --go_opt=paths=source_relative flag to tell protoc to continue placing output next to the input. Remove staticcheck from build.sh; we get it via golangci-lint now. Pass --no-document to gem install fpm; this is recommended in the fpm docs.	2020-04-23 18:54:44 -07:00
Jacob Hoffman-Andrews	4a2029b293	Use explicit fmt.Sprintf for ProblemDetails (#4787 ) In #3708, we added formatters for the convenience methods in the `probs` package. However, in #4783, @alexzorin pointed out that we were incorrectly passing an error message through fmt.Sprintf as the format parameter rather than as a value parameter. I proposed a fix in #4784, but during code review we concluded that the underlying problem was the pattern of using format-style functions that don't have some variant of printf in the name. That makes this wrong: `probs.DNS(err.Error())`, and this right: `probs.DNS("%s", err)`. Since that's an easy mistake to make and a hard one to spot during code review, we're going to stop using this particular pattern and call `fmt.Sprintf` directly. This PR reverts #3708 and adds some `fmt.Sprintf` where needed.	2020-04-21 14:36:11 -07:00
Jacob Hoffman-Andrews	2d7337dcd0	Remove newlines from log messages. (#4777 ) Since Boulder's log system adds checksums to lines, but log-validator processes entries on a per-line basis, including newlines in log messages can cause a validation failure.	2020-04-16 16:49:08 -07:00
Jacob Hoffman-Andrews	bc528cf8cd	Error when redirect target is too long. (#4775 ) This can happen when a misconfiguration redirects a certain path to itself, doubled. After 10 redirects the error message can get quite long. Instead we halt things at 2000 bytes, which should be more than enough.	2020-04-15 13:44:26 -07:00
Jacob Hoffman-Andrews	72deb5b798	gofmt code with -s (simplify) flag (#4763 ) Found by golangci-lint's `gofmt` linter.	2020-04-08 17:25:35 -07:00
Jacob Hoffman-Andrews	75024c3ec1	Replace clock.Default() with clock.New() (#4761 ) clock.Default is deprecated: https://godoc.org/github.com/jmhodges/clock#Default	2020-04-08 17:23:43 -07:00
Jacob Hoffman-Andrews	cdb0bddbd8	Prefix error names with "Err" (#4755 ) Staticcheck cleanup: https://staticcheck.io/docs/checks#ST1012	2020-04-08 17:19:35 -07:00
Jacob Hoffman-Andrews	27e785f3f2	VA: Add "During secondary validation:" error prefix. (#4677 ) This should make it easier to distinguish errors that are triggered by remote failures rather than local ones.	2020-02-14 14:00:08 -05:00
Daniel McCarney	f1894f8d1d	tidy: typo fixes flagged by codespell (#4634 )	2020-01-07 14:01:26 -05:00
Roland Bracewell Shoemaker	5b2f11e07e	Switch away from old style statsd metrics wrappers (#4606 ) In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back. There aren't many dashboards that rely on old statsd style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case, dashboards that use these will also need to be updated. As far as I can tell no alerts are impacted by this change. Fixes #4591.	2019-12-18 11:08:25 -05:00
Daniel McCarney	6ed4ce23a8	bdns: move logDNSError to exchangeOne, log ErrId specially. (#4553 ) We've found we need the context offered from logging the error closer to when it happens in the `bdns` package rather than in the `va`. Adopting the function requires adapting it slightly. Specifically in the new location we know it won't be called with any timeout results, with a non-dns error, or with a nil underlying error. Having the logging done in `bdns` (and specifically from `exchangeOne`) also lets us log the wire format of the query and response when we get a `dns.ErrId` error indicating a query/response ID mismatch. A small unit test is included that ensures the logging happens as expected. In case it proves useful for matching against other metrics the DNS ID mismatch error case also now increments a dedicated prometheus counter vector stat, `dns_id_mismatch`. The stat is labelled by resolver and query type. Resolves https://github.com/letsencrypt/boulder/issues/4532	2019-11-15 16:03:45 -05:00
Jacob Hoffman-Andrews	7f6caddc5b	VA: log internal DNS errors. (#4520 ) When we get a DNS error that has an internal cause (like connection refused), we return a generic message like "networking error" to the user to avoid revealing details that would be confusing. However, when debugging problems with our own services, it's useful to have the underlying errors. This adds a helper method in the VA and calls it from each place we use DNS errors.	2019-11-04 09:09:24 -05:00
Daniel McCarney	7b60b57c33	va: log account ID in multi VA differential JSON. (#4521 ) This will reduce the amount of analysis time required to identify large integrators that aren't compatible with multi VA.	2019-10-31 13:12:28 -04:00
Daniel McCarney	2926074a29	CI/Dev: enable TLS 1.3 (#4489 ) Also update the VA's TLS-ALPN-01 TLS 1.3 unit test to not expect a failure.	2019-10-17 14:01:38 -04:00

581 Commits