Commit Graph

523 Commits

Author SHA1 Message Date
Aaron Gable 5e1bc3b501
Simplify the features package (#7204)
Replace the current three-piece setup (enum of feature variables, map of
feature vars to default values, and autogenerated bidirectional maps of
feature variables to and from strings) with a much simpler one-piece
setup: a single struct with one boolean-typed field per feature. This
preserves the overall structure of the package -- a single global
feature set protected by a mutex, and Set, Reset, and Enabled methods --
although the exact function signatures have all changed somewhat.

The executable config format remains the same, so no deployment changes
are necessary. This change does deprecate the AllowUnrecognizedFeatures
feature, as we cannot tell the json config parser to ignore unknown
field names, but that flag is set to False in all of our deployment
environments already.

Fixes https://github.com/letsencrypt/boulder/issues/6802
Fixes https://github.com/letsencrypt/boulder/issues/5229
2023-12-12 15:51:57 -05:00
Jacob Hoffman-Andrews c21b376623
Implement DoH for validation queries (#7178)
Fixes: #7141
2023-12-11 10:49:00 -08:00
Aaron Gable 6c92c3041a
CAA: Treat NXDOMAIN for a TLD as an error (#7104)
Change the CAA NXDOMAIN carve-out to only apply to registered domains
and their subdomains, not to TLDs.

Our CAA lookup function has a carveout that allows queries which receive
an NXDOMAIN response to be treated as though they received a successful
empty response. This is important due to the confluence of three
circumstances:
1) many clients use the DNS-01 method to validate issuance for names
which generally don't have publicly-visible A/AAAA records;
2) many ACME clients remove their DNS-01 TXT record promptly after
validation has completed; and
3) authorizations may be reused more than 8 hours after they were first
validated and CAA was first checked.

When these circumstances combine, the DNS rightly returns NXDOMAIN when
we re-check CAA at issuance time, because no records exist at all for
that name. We have to treat this as permission to issue, the same as any
other domain which has no CAA records.

However, this should never be the case for TLDs: those should always
have at least one record. If a TLD returns NXDOMAIN, something much
worse has happened -- such as a gTLD being unlisted by ICANN -- and we
should treat it as a failure.

This change adds a check that the name in question contains at least one
dot (".") before triggering the existing carve-out, to make it so that
the carve-out does not apply to TLDs.

Fixes https://github.com/letsencrypt/boulder/issues/7056
2023-10-02 10:04:39 -07:00
Aaron Gable 3b880e1ccf
Add CAAAfterValidation feature flag (#7082)
Add a new feature flag "CAAAfterValidation" which, when set to true in
the VA, causes the VA to only begin CAA checks after basic domain
control validation has completed successfully. This will make successful
validations take longer, since the DCV and CAA checks are performed
serially instead of in parallel. However, it will also reduce the number
of CAA checks we perform by up to 80%, since such a high percentage of
validations also fail.

IN-9575 tracks enabling this feature flag in staging and prod
Fixes https://github.com/letsencrypt/boulder/issues/7058
2023-09-18 13:30:31 -07:00
Aaron Gable e09c5faf5e
Deprecate CAA AccountURI and ValidationMethods feature flags (#7000)
These flags are set to true in all environments.
2023-07-14 14:54:39 -04:00
Aaron Gable cc596bd4eb
Begin testing on go1.21rc2 with loopvar experiment (#6952)
Add go1.21rc2 to the matrix of go versions we test against.

Add a new step to our CI workflows (boulder-ci, try-release, and
release) which sets the "GOEXPERIMENT=loopvar" environment variable if
we're running go1.21. This experiment makes it so that loop variables
are scoped only to their single loop iteration, rather than to the whole
loop. This prevents bugs such as our CAA Rechecking incident
(https://bugzilla.mozilla.org/show_bug.cgi?id=1619047). Also add a line
to our docker setup to propagate this environment variable into the
container, where it can affect builds.

Finally, fix one TLS-ALPN-01 test to have the fake subscriber server
actually willing to negotiate the acme-tls/1 protocol, so that the ACME
server's tls client actually waits to (fail to) get the certificate,
instead of dying immediately. This fix is related to the upgrade to
go1.21, not the loopvar experiment.

Fixes https://github.com/letsencrypt/boulder/issues/6950
2023-06-26 16:35:29 -07:00
Jacob Hoffman-Andrews 8dcbc4c92f
Add must.Do utility function (#6955)
This can take two values (typically the return values of a two-value
function) and panic if the error is non-nil, returning the interesting
value. This is particularly useful for cases where we statically know
the call will succeed.

Thanks to @mcpherrinm for the idea!
2023-06-26 14:43:30 -07:00
Aaron Gable 620699216f
Remove the TLS-ALPN-01 tlsDial helper (#6954)
This minor cleanup was found in the process of fixing tests in
https://github.com/letsencrypt/boulder/pull/6952, and resolves a TODO
from 2018.
2023-06-26 10:56:52 -07:00
Aaron Gable 2c9925797b
CAA: Don't fail on critical iodef property tags (#6921)
RFC 8659 (CAA; https://www.rfc-editor.org/rfc/rfc8659) says that "A CA
MUST NOT issue certificates for any FQDN if the Relevant RRset for that
FQDN contains a CAA critical Property for an unknown or unsupported
Property Tag."

Let's Encrypt does technically support the iodef property tag: we
recognize it, but then ignore it and never choose to send notifications
to the given contact address. Historically, we have carried around the
iodef property tags in our internal structures as though we might use
them, but all code referencing them was essentially dead code.

As part of a set of simplifications,
https://github.com/letsencrypt/boulder/pull/6886 made it so that we
completely ignore iodef property tags. However, this had the unintended
side-effect of causing iodef property tags with the Critical bit set to
be counted as "unknown critical" tags, which prevent issuance.

This change causes our property tag parsing code to recognize iodef tags
again, so that critical iodef tags don't prevent issuance.
2023-05-30 11:33:18 -07:00
Phil Porada c75bf7033a
SA: Don't store HTTP-01 hostname and port in database validationrecord (#6863)
Removes the `Hostname` and `Port` fields from an http-01
ValidationRecord model prior to storing the record in the database.
Using `"hostname":"example.com","port":"80"` as a snippet of a whole
validation record, we'll save minimum 36 bytes for each new http-01
ValidationRecord that gets stored. When retrieving the record, the
ValidationRecord `RehydrateHostPort` method will repopulate the
`Hostname` and `Port` fields from the `URL` field.

Fixes the main goal of
https://github.com/letsencrypt/boulder/issues/5231.

---------

Co-authored-by: Samantha <hello@entropy.cat>
2023-05-23 15:36:17 -04:00
Aaron Gable 3990a08328
Add relevant domain to CAA errors and logs (#6886)
When processing CAA records, keep track of the FQDN at which that CAA
record was found (which may be different from the FQDN for which we are
attempting issuance, since we crawl CAA records upwards from the
requested name to the TLD). Then surface this name upwards so that it
can be included in our own log lines and in the problem documents which
we return to clients.

Fixes https://github.com/letsencrypt/boulder/issues/3171
2023-05-22 15:08:56 -04:00
alexzorin 0a65e87c1b
va: make http keyAuthz mismatch problem wording less ambiguous (#6903)
Occasionally (and just now) I've responded to an issue or thread that
involves this error message:

> The key authorization file from the server did not match this
challenge
"LoqXcYV8q5ONbJQxbmR7SCTNo3tiAXDfowyjxAjEuX0.9jg46WB3rR_AHD-EBXdN7cBkH1WOu0tA3M9fm21mqTI"
!= "\xef\xffAABBCC

and I've found myself looking at Boulder's source code, to check which
way around the values are. I suspect that users are not understanding it
either.
2023-05-18 12:04:14 -04:00
Aaron Gable 1fcd951622
Probs: simplifications and cleanup (#6876)
Make minor, non-user-visible changes to how we structure the probs
package. Notably:
- Add new problem types for UnsupportedContact and
UnsupportedIdentifier, which are specified by RFC8555 and which we will
use in the future, but haven't been using historically.
- Sort the problem types and constructor functions to match the
(alphabetical) order given in RFC8555.
- Rename some of the constructor functions to better match their
underlying problem types (e.g. "TLSError" to just "TLS").
- Replace the redundant ProblemDetailsToStatusCode function with simply
always returning a 500 if we haven't properly set the problem's
HTTPStatus.
- Remove the ability to use either the V1 or V2 error namespace prefix;
always use the proper RFC namespace prefix.
2023-05-12 12:10:13 -04:00
Aaron Gable 9262ca6e3f
Add grpc implementation tests to all services (#6782)
As a follow-up to #6780, add the same style of implementation test to
all of our other gRPC services. This was not included in that PR just to
keep it small and single-purpose.
2023-03-31 09:52:26 -07:00
Matthew McPherrin e1ed1a2ac2
Remove beeline tracing (#6733)
Remove tracing using Beeline from Boulder. The only remnant left behind
is the deprecated configuration, to ensure deployability.

We had previously planned to swap in OpenTelemetry in a single PR, but
that adds significant churn in a single change, so we're doing this as
multiple steps that will each be significantly easier to reason about
and review.

Part of #6361
2023-03-14 15:14:27 -07:00
Matthew McPherrin 9e2a8d3882
Time only a single iteration of va.fetchHTTP to increase test reliability (#6719)
Compute "started" and "took" variables inside the retry loop, so that we're 
actually measuring how long the request which timed out took, rather than 
measuring how long the request which timed out plus all the previous "network 
unreachable" attempts took.
2023-03-08 11:57:36 -05:00
Phil Porada 9390c0e5f5
Put errors at end of log lines (#6627)
For consistency, put the error field at the end of unstructured log
lines to make them more ... structured.

Adds the `issuerID` field to "orphaning certificate" log line in the CA
to match the "orphaning precertificate" log line.

Fixes broken tests as a result of the CA and bdns log line change.

Fixes #5457
2023-02-03 11:28:38 -05:00
Phil Porada 9d9a2dddcf
Rework VA PortConfig (#6619)
Remove the PortConfig field from both the VA's config struct and from
the NewValidationAuthorityImpl constructor. This config item is no
longer used anywhere, and removing this prevents us from accidentally
overriding the "Authorized Ports" (80 and 443) which are required by the
Baseline Requirements.

Unit tests are still able to override the httpPort and tlsPort fields of
the ValidationAuthorityImpl.

Fixes #3940
2023-01-30 17:03:33 -08:00
Phil Porada 26e5b24585
dependencies: Replace square/go-jose.v2 with go-jose/go-jose.v2 (#6598)
Fixes #6573
2023-01-24 12:08:30 -05:00
Jacob Hoffman-Andrews 60683e36ee
va: filter invalid UTF-8 in IsCAAValid (#6525)
Followup from #6506.

As noted in that PR: I'm not aware of any way to trigger invalid UTF-8
from the layers below this, so I can't think of a good way to unittest
it.
2022-11-21 15:47:48 -08:00
Jacob Hoffman-Andrews 46323d25be
va: filter invalid UTF-8 from ProblemDetails (#6506)
This avoids serialization errors passing through gRPC.

Also, add a pass-through path in replaceInvalidUTF8 that saves an
allocation in the trivial case.

Fixes #6490
2022-11-21 11:05:21 -08:00
Aaron Gable 7517b0d80f
Rehydrate CAA account and method binding (#6501)
Make minor changes to our implementation of CAA Account and Method
Binding, as a result of reviewing the code in preparation for enabling
it in production. Specifically:
- Ensure that the validation method and account ID are included at the
request level, rather than waiting until we perform the checks which use
those parameters;
- Clean up code which assumed the validation method and account ID might
not be populated;
- Use the core.AcmeChallenge type (rather than plain string) for the
validation method everywhere;
- Update comments to reference the latest version and correct sections
of the CAA RFCs; and
- Remove the CAA feature flags from the config integration tests to
reflect that they are not yet enabled in prod.

I have reviewed this code side-by-side with RFC 8659 (CAA) and RFC 8657
(ACME CAA Account and Method Binding) and believe it to be compliant
with both.
2022-11-17 13:31:04 -08:00
Aaron Gable 89f7fb1636
Clean up go1.19 TODOs (#6464)
Clean up several spots where we were behaving differently on
go1.18 and go1.19, now that we're using go1.19 everywhere. Also
re-enable the lint and generate tests, and fix the various places where
the two versions disagreed on how comments should be formatted.

Also clean up the OldTLS codepaths, now that both go1.19 and our
own feature flags have forbidden TLS < 1.2 everywhere.

Fixes #6011
2022-10-21 15:54:18 -07:00
Samantha bdd9ad9941
grpc: Pass data necessary for Retry-After headers in BoulderErrors (#6415)
- Add a new field, `RetryAfter` to `BoulderError`s
- Add logic to wrap/unwrap the value of the `RetryAfter` field to our gRPC error
  interceptor
- Plumb `RetryAfter` for `DuplicateCertificateError` emitted by RA to the WFE
  client response header
  
Part of #6256
2022-10-03 16:24:58 -07:00
Samantha 90eb90bdbe
test: Replace sd-test-srv with consul (#6389)
- Add a dedicated Consul container
- Replace `sd-test-srv` with Consul
- Add documentation for configuring Consul
- Re-issue all gRPC credentials for `<service-name>.service.consul`

Part of #6111
2022-09-19 16:13:53 -07:00
Aaron Gable 0340b574d9
Add unparam linter to CI (#6312)
Enable the "unparam" linter, which checks for unused function
parameters, unused function return values, and parameters and
return values that always have the same value every time they
are used.

In addition, fix many instances where the unparam linter complains
about our existing codebase. Remove error return values from a
number of functions that never return an error, remove or use
context and test parameters that were previously unused, and
simplify a number of (mostly test-only) functions that always take the
same value for their parameter. Most notably, remove the ability to
customize the RSA Public Exponent from the ceremony tooling,
since it should always be 65537 anyway.

Fixes #6104
2022-08-23 12:37:24 -07:00
Aaron Gable d1b211ec5a
Start testing on go1.19 (#6227)
Run the Boulder unit and integration tests with go1.19.

In addition, make a few small changes to allow both sets of
tests to run side-by-side. Mark a few tests, including our lints
and generate checks, as go1.18-only. Reformat a few doc
comments, particularly lists, to abide by go1.19's stricter gofmt.

Causes #6275
2022-08-10 15:30:43 -07:00
Aaron Gable 9c197e1f43
Use io and os instead of deprecated ioutil (#6286)
The iotuil package has been deprecated since go1.16; the various
functions it provided now exist in the os and io packages. Replace all
instances of ioutil with either io or os, as appropriate.
2022-08-10 13:30:17 -07:00
Aaron Gable 8227c8fcb2
Fix flake in TestMultiVA (#6141)
Fix a race condition in one of the TestMultiVA test cases.

There are many other possible flakes in these tests, because
they use real HTTP to talk to a fake remote VA on localhost,
but we can at least trivially remove this race condition.

Fixes #6119
2022-05-25 14:25:41 -07:00
Aaron Gable 9b4ca235dd
Update boulder-tools dependencies (#6129)
Update:
- golangci-lint from v1.42.1 to v1.46.2
- protoc from v3.15.6 to v3.20.1
- protoc-gen-go from v1.26.0 to v1.28.0
- protoc-gen-go-grpc from v1.1.0 to v1.2.0
- fpm from v1.14.0 to v1.14.2

Also remove a reference to go1.17.9 from one last place.

This does result in updating all of our generated .pb.go files, but only
to update the version number embedded in each file's header.

Fixes #6123
2022-05-20 14:24:01 -07:00
Aaron Gable 3a94adecb1
Fix nil pointer dereference in TestMultiVA (#6132)
If the test unexpectedly failed, it would hit a nil pointer dereference
on line 511 when it tries to access the `.Type` field of nil. Add another
case to handle this.
2022-05-19 17:16:06 -07:00
Aaron Gable 8cb01a0c34
Enable additional linters (#6106)
These new linters are almost all part of golangci-lint's collection
of default linters, that would all be running if we weren't setting
`disable-all: true`. By adding them, we now have parity with the
default configuration, as well as the additional linters we like.

Adds the following linters:
* unconvert
* deadcode
* structcheck
* typecheck
* varcheck
* wastedassign
2022-05-11 13:58:58 -07:00
Jacob Hoffman-Andrews cf9df961ba
Add feature flags for upcoming deprecations (#6043)
This adds three features flags: SHA1CSRs, OldTLSOutbound, and
OldTLSInbound. Each controls the behavior of an upcoming deprecation
(except OldTLSInbound, which isn't yet scheduled for a deprecation
but will be soon). Note that these feature flags take advantage of
`features`' default values, so they can default to "true" (that is, each
of these features is enabled by default), and we set them to "false"
in the config JSON to turn them off when the time comes.

The unittest for OldTLSOutbound requires that `example.com` resolves
to 127.0.0.1. This is because there's logic in the VA that checks
that redirected-to hosts end in an IANA TLD. The unittest relies on
redirecting, and we can't use e.g. `localhost` in it because of that
TLD check, so we use example.com.

Fixes #6036 and #6037
2022-04-15 12:14:00 -07:00
Samantha a9ba5e42a0
VA: Add IP address to detailed errors (#6039)
Prepend the IP address of the remote host where HTTP-01 or TLS-ALPN-01
validation was attempted in the detailed error response body.

Fixes #6016
2022-04-13 12:55:35 -07:00
Aaron Gable ed912c3aa5
Remove duplication from TLS-ALPN-01 error messages (#6028)
Slightly refactor `validateTLSALPN01` to use a common function
to format the error messages it returns. This reduces code duplication
and makes the important validation logic easier to follow.

Fixes #5922
2022-04-04 09:17:16 -07:00
Jacob Hoffman-Andrews 07cb1179d0
Add logging of "oldTLS" bit (#6008)
That causes the VA to emit ValidationRecords with the OldTLS bit set if
it observes a redirect to HTTPS that negotiates TLS < 1.2.

I've manually tested but there is not yet an integration test. I need
to make a parallel change in challtestsrv and then incorporate here.
2022-03-21 11:34:03 -07:00
Aaron Gable b19b79162f
Minor updates from review of the HTTP-01 method (#5975)
Make minor updates to our implementation of the HTTP-01 validation method based
on in-depth review of BRs Section 3.2.2.4.19 and RFC 8555 Section 8.3.
- Move the HTTP response code check above parsing the body.
- Explicitly check for 301, 302, 307, and 308 redirect codes, so that if the go
  stdlib updates to allow additional redirects we don't follow suit.
- Trim additional forms of white-space from the key authorization.
2022-03-03 11:23:10 -08:00
Aaron Gable c94a24897f
Remove go1.16 backwards compatibility hacks (#5952)
These were needed for the transition from go1.16 to go1.17. We
don't run go1.16 anywhere anymore, so they can be removed.
2022-02-22 14:23:28 -08:00
Aaron Gable cfab636c5a
TLS-ALPN-01: Check that challenge cert is self-signed (#5928)
RFC 8737 says "The client prepares for validation by constructing a self-signed
certificate...". Add a check for whether the challenge certificate is
self-signed by ensuring its issuer and subject are equal, and checking its
signature with its own public key.

Also slightly refactor the helper methods to return only a single cert, since we
only care about the first one returned. And add a test.
2022-02-02 16:45:13 -08:00
Aaron Gable 305ef9cce9
Improve error checking paradigm (#5920)
We have decided that we don't like the if err := call(); err != nil
syntax, because it creates confusing scopes, but we have not cleaned up
all existing instances of that syntax. However, we have now found a
case where that syntax enables a bug: It caused readers to believe that
a later err = call() statement was assigning to an already-declared err
in the local scope, when in fact it was assigning to an
already-declared err in the parent scope of a closure. This caused our
ineffassign and staticcheck linters to be unable to analyze the
lifetime of the err variable, and so they did not complain when we
never checked the actual value of that error.

This change standardizes on the two-line error checking syntax
everywhere, so that we can more easily ensure that our linters are
correctly analyzing all error assignments.
2022-02-01 14:42:43 -07:00
J.C. Jones f511442e84
Don't allow multiple certificate extensions from TLS-ALPN-01 (#5919)
Detect when a non-compliant ACME client presents a non-compliant x.509
certificate that contains multiple copies of the SubjectAlternativeName or ACME
Identifier extensions; this should be forbidden.
2022-02-01 13:37:51 -08:00
Aaron Gable 25f5e40e77
Require that TLS-ALPN-01 cert have only dnsNames (#5916)
RFC 8737 states that the certificate presented during the "acme-tls/1"
handshake has "a subjectAltName extension containing the dNSName
being validated and no other entries". We were checking that it contained
no other dNSNames, but not requiring that it not have any other kinds of
Subject Alternative Names.

Factor all of our SAN checks into a helper function. Have that function
construct the expected bytes of the SAN extension from the one DNS
name we expect to see, and assert that the actual bytes match the
expectation. Add non-DNS-name identifiers to our error output when we
encounter a cert whose SANs don't match. And add tests which check
that we fail the validation when the cert has multiple SANs.
2022-01-28 15:38:57 -08:00
Aaron Gable 4835709232
Remove support for obsolete id-pe-acmeIdentifier OID (#5906)
Current metrics show that subscribers present certificates using the
obsolete OID to identify their id-pe-acmeIdentifier extension about
an order of magnitude less often than they present the correct OID.
Remove support for the never-standardized OID.
2022-01-25 10:10:03 -08:00
Aaron Gable 8c28e49ab6
Enforce TLS1.2 when validating TLS-ALPN-01 (#5905)
RFC 8737, Section 4, states "ACME servers that implement "acme-tls/1"
MUST only negotiate TLS 1.2 [RFC5246] or higher when connecting to
clients for validation." Enforce that our outgoing connections to validate
TLS-ALPN-01 challenges do not negotiate TLS1.1.
2022-01-25 09:57:34 -08:00
Aaron Gable ab79f96d7b
Fixup staticcheck and stylecheck, and violations thereof (#5897)
Add `stylecheck` to our list of lints, since it got separated out from
`staticcheck`. Fix the way we configure both to be clearer and not
rely on regexes.

Additionally fix a number of easy-to-change `staticcheck` and
`stylecheck` violations, allowing us to reduce our number of ignored
checks.

Part of #5681
2022-01-20 16:22:30 -08:00
Aaron Gable 2f2bac4bf2
Improve readability of A and AAAA lookup errors (#5843)
When we query DNS for a host, and both the A and AAAA lookups fail or
are empty, combine both errors into a single error rather than only
returning the error from the A lookup.

Fixes #5819
Fixes #5319
2022-01-03 10:39:25 -08:00
Jacob Hoffman-Andrews 4205400a98
Lower logDNSError to info level. (#5701)
These log lines are sometimes useful for debugging, but are a normal
part of operation, not an error: Unbound will allow a response to
timeout if the remote server is too slow.
2021-10-12 10:44:54 -06:00
Samantha 6eee230d69
BDNS: Ensure DNS server addresses are dialable (#5520)
- Add function `validateServerAddress()` to `bdns/servers.go` which ensures that
  DNS server addresses are TCP/ UDP dial-able per: https://golang.org/src/net/dial.go?#L281
- Add unit test for `validateServerAddress()` in `bdns/servers_test.go`
- Update `cmd/boulder-va/main.go` to handle `bdns.NewStaticProvider()`
  potentially returning an error.
- Update unit tests in `bdns/dns_test.go`:
  - Handle `bdns.NewStaticProvider()` potentially returning an error
  - Add an IPv6 address to `TestRotateServerOnErr`
- Ensure DNS server addresses are validated by `validateServerAddress` whenever:
  - `dynamicProvider.update() is called`
  - `staticProvider` is constructed
- Construct server addresses using `net.JoinHostPost()` when
  `dynamicProvider.Addrs()` is called

Fixes #5463
2021-07-20 10:11:11 -07:00
Aaron Gable 4c581436a3
Add go1.17beta1 to CI (#5483)
Add go1.17beta1 docker images to the set of things we build,
and integrate go1.17beta1 into the set of environments CI runs.
Fix one test which breaks due to an underlying refactoring in
the `crypto/x509` stdlib package. Fix one other test which breaks
due to new guarantees in the stdlib's TLS ALPN implementation.

Also removes go1.16.5 from CI so we're only running 2 versions.

Fixes #5480
2021-07-13 10:00:04 -07:00
Aaron Gable 64c9ec350d
Unify protobuf generation (#5458)
Create script which finds every .proto file in the repo and correctly
invokes `protoc` for each. Create a single file with a `//go:generate`
directive to invoke the new script. Delete all of the other generate.go
files, so that our proto generation is unified in one place.

Fixes #5453
2021-06-07 08:49:15 -07:00