581 Commits

Author SHA1 Message Date
Aaron Gable 78e4e82ffa
Feature cleanup (#7320)
Remove three deprecated feature flags which have been removed from all
production configs:
- StoreLintingCertificateInsteadOfPrecertificate
- LeaseCRLShards
- AllowUnrecognizedFeatures

Deprecate three flags which are set to true in all production configs:
- CAAAfterValidation
- AllowNoCommonName
- SHA256SubjectKeyIdentifier

IN-9879 tracked the removal of these flags.
2024-02-13 17:42:27 -08:00
Jacob Hoffman-Andrews 3865b46638
va: return error instead of ProblemDetails (#7313)
This allows us to defer creating the user-friendly ProblemDetails to the
highest level (va.PerformValidation), which in turn makes it possible to
log the original error alongside the user-friendly error. It also
reduces the likelihood of "boxed nil" bugs.

Many of the unittests check for a specific ProblemDetails.Type and
specific Details contents. These test against the output of
`detailedError`, which transforms `error` into `ProblemDetails`. So the
updates to the tests include insertion of `detailedError(err)` in many
places.

Several places that were returning a specific ProblemDetails.Type
instead return the corresponding `berrors` type. This follows a pattern
that `berrors` was designed to enable: use the `berrors` types
internally and transform into `ProblemDetails` at the edge where we are
rendering something to present to the user: WFE, and now VA.
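
The "boxed nil" hazard mentioned above can be shown with a minimal sketch. The `problem` type and both functions are hypothetical stand-ins for illustration, not Boulder's actual `ProblemDetails` code.

```go
package main

import "fmt"

// problem is a hypothetical stand-in for a concrete error type such as
// ProblemDetails.
type problem struct{ detail string }

func (p *problem) Error() string { return p.detail }

// validateBoxed returns a concrete pointer type. When a caller stores the
// nil pointer in an error interface, the interface itself is non-nil.
func validateBoxed() *problem { return nil }

// validatePlain returns the error interface directly, so nil stays nil.
func validatePlain() error { return nil }

func main() {
	var err error = validateBoxed() // boxes a nil *problem in an interface
	fmt.Println(err != nil)         // true: the "boxed nil" surprise
	fmt.Println(validatePlain() != nil) // false
}
```

Returning plain `error` from internal functions avoids the first case entirely, because no concrete pointer type is ever converted to an interface along the way.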
2024-02-12 11:34:49 -08:00
Jacob Hoffman-Andrews c2f7bfd645
va: remove arg from processRemoteCAAResults (#7314)
In the current code, when `processRemoteCAAResults` is called, its
`primaryResult` parameter is always set to `nil`. So we can simplify by
removing that parameter.
2024-02-09 12:27:15 -08:00
Aaron Gable 3ac6996600
Recognize, but do not process, issuemail CAA tags (#7307)
This allows Let's Encrypt applicants/subscribers to have critical
issuemail CAA property tags without causing Let's Encrypt to bail out
due to an unknown critical tag. The issuemail property tag was defined
in RFC 9495 (https://www.rfc-editor.org/rfc/rfc9495.html).
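
A minimal sketch of the "recognize, but do not process" idea follows. The `caaRecord` type and `hasUnknownCritical` function are hypothetical and much simpler than a real CAA implementation; the set of tags is an assumption based on the tags named in this log.

```go
package main

import (
	"fmt"
	"strings"
)

// caaRecord is a hypothetical, simplified CAA resource record.
type caaRecord struct {
	Critical bool
	Tag      string
	Value    string
}

// hasUnknownCritical reports whether any record sets the critical flag on a
// property tag that this sketch does not recognize.
func hasUnknownCritical(records []caaRecord) bool {
	for _, r := range records {
		switch strings.ToLower(r.Tag) {
		case "issue", "issuewild":
			// Recognized and processed (processing omitted in this sketch).
		case "iodef", "issuemail":
			// Recognized but deliberately not processed.
		default:
			if r.Critical {
				return true
			}
		}
	}
	return false
}

func main() {
	records := []caaRecord{
		{Critical: true, Tag: "issuemail", Value: ";"},
		{Critical: false, Tag: "issue", Value: "letsencrypt.org"},
	}
	fmt.Println(hasUnknownCritical(records)) // false

	records = append(records, caaRecord{Critical: true, Tag: "futuretag", Value: "x"})
	fmt.Println(hasUnknownCritical(records)) // true
}
```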

Fixes https://github.com/letsencrypt/boulder/issues/7301
2024-02-07 09:09:07 -08:00
Phil Porada 0e9f5d3545
va: Audit log which DNS resolver performs a lookup (#7271)
Adds the chosen DNS resolver to the VA's `ValidationRecord` object so
that for each challenge type during a validation, Boulder can audit log
the resolver(s) chosen to fulfill the request.

Fixes https://github.com/letsencrypt/boulder/issues/7140
2024-02-05 14:26:39 -05:00
Phil Porada 03152aadc6
RVA: Recheck CAA records (#7221)
Previously, `va.IsCAAValid` would only check CAA records from the
primary VA during initial domain control validation, completely ignoring
any configured RVAs. The upcoming
[MPIC](https://github.com/ryancdickson/staging/pull/8) ballot will
require that it be done from multiple perspectives. With the currently
deployed [Multi-Perspective
Validation](https://letsencrypt.org/2020/02/19/multi-perspective-validation.html)
in staging and production, this change brings us in line with the
[proposed phase
3](https://github.com/ryancdickson/staging/pull/8/files#r1368708684).
This change reuses the existing
[MaxRemoteValidationFailures](21fc191273/cmd/boulder-va/main.go (L35))
variable for the required non-corroboration quorum.
> Phase 3: June 15, 2025 - December 14, 2025 ("CAs MUST implement MPIC
in blocking mode*"):
>
>    MUST implement MPIC? Yes
> Required quorum?: Minimally, 2 remote perspectives must be used. If
using less than 6 remote perspectives, 1 non-corroboration is allowed.
If using 6 or more remote perspectives, 2 non-corroborations are
allowed.
>    MUST block issuance if quorum is not met: Yes.
> Geographic diversity requirements?: Perspectives must be 500km from 1)
the primary perspective and 2) all other perspectives used in the
quorum.
>
> * Note: "Blocking Mode" is a nickname. As opposed to "monitoring mode"
(described in the last milestone), CAs MUST NOT issue a certificate if
quorum requirements are not met from this point forward.
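
The quorum rule quoted above can be written as a small function. This is a sketch of the quoted text only; `allowedNonCorroborations` is a hypothetical name and is not taken from Boulder.

```go
package main

import (
	"errors"
	"fmt"
)

// allowedNonCorroborations returns how many remote perspectives may disagree
// under the quoted Phase 3 rule: 1 when fewer than 6 remote perspectives are
// used, 2 when 6 or more are used. Fewer than 2 perspectives is an error.
func allowedNonCorroborations(remotePerspectives int) (int, error) {
	if remotePerspectives < 2 {
		return 0, errors.New("at least 2 remote perspectives are required")
	}
	if remotePerspectives < 6 {
		return 1, nil
	}
	return 2, nil
}

func main() {
	for _, n := range []int{1, 2, 5, 6} {
		allowed, err := allowedNonCorroborations(n)
		fmt.Println(n, allowed, err)
	}
}
```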

Adds new VA feature flags: 
* `EnforceMultiCAA` instructs a primary VA to command each of its
configured RVAs to perform a CAA recheck.
* `MultiCAAFullResults` causes the primary VA to block waiting for all
RVA CAA recheck results to arrive.


Renamed `va.logRemoteValidationDifferentials` to
`va.logRemoteDifferentials` because it can handle initial domain control
validations and CAA rechecking with minimal editing.

Part of https://github.com/letsencrypt/boulder/issues/7061
2024-01-25 16:23:25 -05:00
Aaron Gable d57edfa0f1
Run more go vet checks (#7255)
Enable the atomicalign, deepequalerrors, findcall, nilness,
reflectvaluecompare, sortslice, timeformat, and unusedwrite go vet
analyzers, which golangci-lint does not enable by default. Additionally,
enable new go vet analyzers by default as they become available.

The fieldalignment and shadow analyzers remain disabled because they
report so many errors that they should be fixed in a separate PR.

Note that the nilness analyzer appears to have found one very real bug
in tlsalpn.go.
2024-01-17 12:27:55 -05:00
Viktor Szépe 5c0ca04575
Fix typos (#7241)
Found new misspellings using the `typos` Rust crate:
https://crates.io/crates/typos
2024-01-09 13:17:27 -08:00
Aaron Gable 5e1bc3b501
Simplify the features package (#7204)
Replace the current three-piece setup (enum of feature variables, map of
feature vars to default values, and autogenerated bidirectional maps of
feature variables to and from strings) with a much simpler one-piece
setup: a single struct with one boolean-typed field per feature. This
preserves the overall structure of the package -- a single global
feature set protected by a mutex, and Set, Reset, and Enabled methods --
although the exact function signatures have all changed somewhat.
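
The one-piece setup described above might look roughly like the sketch below. The field names are borrowed from flags mentioned elsewhere in this log; the `Get` accessor and all signatures are assumptions for illustration, not the package's real API.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is a hypothetical feature set with one boolean field per feature.
type Config struct {
	CAAAfterValidation bool
	AllowNoCommonName  bool
}

var (
	mu     sync.RWMutex
	global Config
)

// Set replaces the global feature set.
func Set(c Config) {
	mu.Lock()
	defer mu.Unlock()
	global = c
}

// Reset restores every feature to its zero value (false).
func Reset() {
	mu.Lock()
	defer mu.Unlock()
	global = Config{}
}

// Get returns a copy of the current feature set (hypothetical accessor).
func Get() Config {
	mu.RLock()
	defer mu.RUnlock()
	return global
}

func main() {
	Set(Config{CAAAfterValidation: true})
	fmt.Println(Get().CAAAfterValidation) // true
	Reset()
	fmt.Println(Get().CAAAfterValidation) // false
}
```

With a struct, a misspelled feature name becomes a compile-time error in Go code rather than a runtime lookup failure.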

The executable config format remains the same, so no deployment changes
are necessary. This change does deprecate the AllowUnrecognizedFeatures
feature, as we cannot tell the json config parser to ignore unknown
field names, but that flag is set to False in all of our deployment
environments already.

Fixes https://github.com/letsencrypt/boulder/issues/6802
Fixes https://github.com/letsencrypt/boulder/issues/5229
2023-12-12 15:51:57 -05:00
Jacob Hoffman-Andrews c21b376623
Implement DoH for validation queries (#7178)
Fixes: #7141
2023-12-11 10:49:00 -08:00
Aaron Gable 6c92c3041a
CAA: Treat NXDOMAIN for a TLD as an error (#7104)
Change the CAA NXDOMAIN carve-out to only apply to registered domains
and their subdomains, not to TLDs.

Our CAA lookup function has a carveout that allows queries which receive
an NXDOMAIN response to be treated as though they received a successful
empty response. This is important due to the confluence of three
circumstances:
1) many clients use the DNS-01 method to validate issuance for names
which generally don't have publicly-visible A/AAAA records;
2) many ACME clients remove their DNS-01 TXT record promptly after
validation has completed; and
3) authorizations may be reused more than 8 hours after they were first
validated and CAA was first checked.

When these circumstances combine, the DNS rightly returns NXDOMAIN when
we re-check CAA at issuance time, because no records exist at all for
that name. We have to treat this as permission to issue, the same as any
other domain which has no CAA records.

However, this should never be the case for TLDs: those should always
have at least one record. If a TLD returns NXDOMAIN, something much
worse has happened -- such as a gTLD being unlisted by ICANN -- and we
should treat it as a failure.

This change adds a check that the name in question contains at least one
dot (".") before triggering the existing carve-out, to make it so that
the carve-out does not apply to TLDs.
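
As a sketch, the dot check described above amounts to something like the following. The function name is hypothetical, and the sketch assumes names are passed without a trailing root dot.

```go
package main

import (
	"fmt"
	"strings"
)

// nxdomainMeansNoCAA reports whether an NXDOMAIN response for name may be
// treated as an empty CAA record set. Names without a dot are TLDs, for which
// NXDOMAIN is treated as a failure. Assumes no trailing root dot.
func nxdomainMeansNoCAA(name string) bool {
	return strings.Contains(name, ".")
}

func main() {
	fmt.Println(nxdomainMeansNoCAA("www.example.com")) // true
	fmt.Println(nxdomainMeansNoCAA("com"))             // false
}
```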

Fixes https://github.com/letsencrypt/boulder/issues/7056
2023-10-02 10:04:39 -07:00
Aaron Gable 3b880e1ccf
Add CAAAfterValidation feature flag (#7082)
Add a new feature flag "CAAAfterValidation" which, when set to true in
the VA, causes the VA to only begin CAA checks after basic domain
control validation has completed successfully. This will make successful
validations take longer, since the DCV and CAA checks are performed
serially instead of in parallel. However, it will also reduce the number
of CAA checks we perform by up to 80%, since such a high percentage of
validations fail.

IN-9575 tracks enabling this feature flag in staging and prod
Fixes https://github.com/letsencrypt/boulder/issues/7058
2023-09-18 13:30:31 -07:00
Aaron Gable e09c5faf5e
Deprecate CAA AccountURI and ValidationMethods feature flags (#7000)
These flags are set to true in all environments.
2023-07-14 14:54:39 -04:00
Aaron Gable cc596bd4eb
Begin testing on go1.21rc2 with loopvar experiment (#6952)
Add go1.21rc2 to the matrix of go versions we test against.

Add a new step to our CI workflows (boulder-ci, try-release, and
release) which sets the "GOEXPERIMENT=loopvar" environment variable if
we're running go1.21. This experiment makes it so that loop variables
are scoped only to their single loop iteration, rather than to the whole
loop. This prevents bugs such as our CAA Rechecking incident
(https://bugzilla.mozilla.org/show_bug.cgi?id=1619047). Also add a line
to our docker setup to propagate this environment variable into the
container, where it can affect builds.
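
The class of bug that per-iteration loop variables prevent can be sketched as follows. The example uses an explicit copy so that it behaves the same with or without the experiment; `collect` is a hypothetical function, not code from the incident.

```go
package main

import "fmt"

// collect returns one closure per name. The explicit copy gives every closure
// its own variable, which is the scoping the loopvar experiment makes
// automatic. Without the copy and without loopvar, all closures would share
// one variable and return the last name.
func collect(names []string) []func() string {
	var fns []func() string
	for _, name := range names {
		name := name // per-iteration copy
		fns = append(fns, func() string { return name })
	}
	return fns
}

func main() {
	for _, fn := range collect([]string{"a.example", "b.example"}) {
		fmt.Println(fn())
	}
}
```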

Finally, fix one TLS-ALPN-01 test to have the fake subscriber server
actually willing to negotiate the acme-tls/1 protocol, so that the ACME
server's tls client actually waits to (fail to) get the certificate,
instead of dying immediately. This fix is related to the upgrade to
go1.21, not the loopvar experiment.

Fixes https://github.com/letsencrypt/boulder/issues/6950
2023-06-26 16:35:29 -07:00
Jacob Hoffman-Andrews 8dcbc4c92f
Add must.Do utility function (#6955)
This can take two values (typically the return values of a two-value
function) and panic if the error is non-nil, returning the interesting
value. This is particularly useful for cases where we statically know
the call will succeed.

Thanks to @mcpherrinm for the idea!
2023-06-26 14:43:30 -07:00
Aaron Gable 620699216f
Remove the TLS-ALPN-01 tlsDial helper (#6954)
This minor cleanup was found in the process of fixing tests in
https://github.com/letsencrypt/boulder/pull/6952, and resolves a TODO
from 2018.
2023-06-26 10:56:52 -07:00
Aaron Gable 2c9925797b
CAA: Don't fail on critical iodef property tags (#6921)
RFC 8659 (CAA; https://www.rfc-editor.org/rfc/rfc8659) says that "A CA
MUST NOT issue certificates for any FQDN if the Relevant RRset for that
FQDN contains a CAA critical Property for an unknown or unsupported
Property Tag."

Let's Encrypt does technically support the iodef property tag: we
recognize it, but then ignore it and never choose to send notifications
to the given contact address. Historically, we have carried around the
iodef property tags in our internal structures as though we might use
them, but all code referencing them was essentially dead code.

As part of a set of simplifications,
https://github.com/letsencrypt/boulder/pull/6886 made it so that we
completely ignore iodef property tags. However, this had the unintended
side-effect of causing iodef property tags with the Critical bit set to
be counted as "unknown critical" tags, which prevent issuance.

This change causes our property tag parsing code to recognize iodef tags
again, so that critical iodef tags don't prevent issuance.
2023-05-30 11:33:18 -07:00
Phil Porada c75bf7033a
SA: Don't store HTTP-01 hostname and port in database validationrecord (#6863)
Removes the `Hostname` and `Port` fields from an http-01
ValidationRecord model prior to storing the record in the database.
Using `"hostname":"example.com","port":"80"` as a snippet of a whole
validation record, we'll save minimum 36 bytes for each new http-01
ValidationRecord that gets stored. When retrieving the record, the
ValidationRecord `RehydrateHostPort` method will repopulate the
`Hostname` and `Port` fields from the `URL` field.
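
Rehydration from the URL might work along these lines. The `record` type and method below are hypothetical, and the default ports for URLs without an explicit port (80 for `http`, 443 for `https`) are an assumption of this sketch.

```go
package main

import (
	"fmt"
	"net/url"
)

// record is a hypothetical, trimmed-down validation record.
type record struct {
	URL      string
	Hostname string
	Port     string
}

// rehydrateHostPort fills Hostname and Port from the URL field.
func (r *record) rehydrateHostPort() error {
	u, err := url.Parse(r.URL)
	if err != nil {
		return err
	}
	if u.Hostname() == "" {
		return fmt.Errorf("URL %q has no hostname", r.URL)
	}
	port := u.Port()
	if port == "" {
		// Assumed defaults when the URL carries no explicit port.
		switch u.Scheme {
		case "http":
			port = "80"
		case "https":
			port = "443"
		default:
			return fmt.Errorf("unsupported scheme %q", u.Scheme)
		}
	}
	r.Hostname, r.Port = u.Hostname(), port
	return nil
}

func main() {
	r := record{URL: "http://example.com/.well-known/acme-challenge/token"}
	if err := r.rehydrateHostPort(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.Hostname, r.Port) // example.com 80
}
```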

Fixes the main goal of
https://github.com/letsencrypt/boulder/issues/5231.

---------

Co-authored-by: Samantha <hello@entropy.cat>
2023-05-23 15:36:17 -04:00
Aaron Gable 3990a08328
Add relevant domain to CAA errors and logs (#6886)
When processing CAA records, keep track of the FQDN at which that CAA
record was found (which may be different from the FQDN for which we are
attempting issuance, since we crawl CAA records upwards from the
requested name to the TLD). Then surface this name upwards so that it
can be included in our own log lines and in the problem documents which
we return to clients.

Fixes https://github.com/letsencrypt/boulder/issues/3171
2023-05-22 15:08:56 -04:00
alexzorin 0a65e87c1b
va: make http keyAuthz mismatch problem wording less ambiguous (#6903)
Occasionally (and just now) I've responded to an issue or thread that
involves this error message:

> The key authorization file from the server did not match this
challenge
"LoqXcYV8q5ONbJQxbmR7SCTNo3tiAXDfowyjxAjEuX0.9jg46WB3rR_AHD-EBXdN7cBkH1WOu0tA3M9fm21mqTI"
!= "\xef\xffAABBCC

and I've found myself looking at Boulder's source code, to check which
way around the values are. I suspect that users are not understanding it
either.
2023-05-18 12:04:14 -04:00
Aaron Gable 1fcd951622
Probs: simplifications and cleanup (#6876)
Make minor, non-user-visible changes to how we structure the probs
package. Notably:
- Add new problem types for UnsupportedContact and
UnsupportedIdentifier, which are specified by RFC8555 and which we will
use in the future, but haven't been using historically.
- Sort the problem types and constructor functions to match the
(alphabetical) order given in RFC8555.
- Rename some of the constructor functions to better match their
underlying problem types (e.g. "TLSError" to just "TLS").
- Replace the redundant ProblemDetailsToStatusCode function with simply
always returning a 500 if we haven't properly set the problem's
HTTPStatus.
- Remove the ability to use either the V1 or V2 error namespace prefix;
always use the proper RFC namespace prefix.
2023-05-12 12:10:13 -04:00
Aaron Gable 9262ca6e3f
Add grpc implementation tests to all services (#6782)
As a follow-up to #6780, add the same style of implementation test to
all of our other gRPC services. This was not included in that PR just to
keep it small and single-purpose.
2023-03-31 09:52:26 -07:00
Matthew McPherrin e1ed1a2ac2
Remove beeline tracing (#6733)
Remove tracing using Beeline from Boulder. The only remnant left behind
is the deprecated configuration, to ensure deployability.

We had previously planned to swap in OpenTelemetry in a single PR, but
that adds significant churn in a single change, so we're doing this as
multiple steps that will each be significantly easier to reason about
and review.

Part of #6361
2023-03-14 15:14:27 -07:00
Matthew McPherrin 9e2a8d3882
Time only a single iteration of va.fetchHTTP to increase test reliability (#6719)
Compute "started" and "took" variables inside the retry loop, so that we're 
actually measuring how long the request which timed out took, rather than 
measuring how long the request which timed out plus all the previous "network 
unreachable" attempts took.
2023-03-08 11:57:36 -05:00
Phil Porada 9390c0e5f5
Put errors at end of log lines (#6627)
For consistency, put the error field at the end of unstructured log
lines to make them more ... structured.

Adds the `issuerID` field to "orphaning certificate" log line in the CA
to match the "orphaning precertificate" log line.

Fixes broken tests as a result of the CA and bdns log line change.

Fixes #5457
2023-02-03 11:28:38 -05:00
Phil Porada 9d9a2dddcf
Rework VA PortConfig (#6619)
Remove the PortConfig field from both the VA's config struct and from
the NewValidationAuthorityImpl constructor. This config item is no
longer used anywhere, and removing this prevents us from accidentally
overriding the "Authorized Ports" (80 and 443) which are required by the
Baseline Requirements.

Unit tests are still able to override the httpPort and tlsPort fields of
the ValidationAuthorityImpl.

Fixes #3940
2023-01-30 17:03:33 -08:00
Phil Porada 26e5b24585
dependencies: Replace square/go-jose.v2 with go-jose/go-jose.v2 (#6598)
Fixes #6573
2023-01-24 12:08:30 -05:00
Jacob Hoffman-Andrews 60683e36ee
va: filter invalid UTF-8 in IsCAAValid (#6525)
Followup from #6506.

As noted in that PR: I'm not aware of any way to trigger invalid UTF-8
from the layers below this, so I can't think of a good way to unittest
it.
2022-11-21 15:47:48 -08:00
Jacob Hoffman-Andrews 46323d25be
va: filter invalid UTF-8 from ProblemDetails (#6506)
This avoids serialization errors passing through gRPC.

Also, add a pass-through path in replaceInvalidUTF8 that saves an
allocation in the trivial case.

Fixes #6490
2022-11-21 11:05:21 -08:00
Aaron Gable 7517b0d80f
Rehydrate CAA account and method binding (#6501)
Make minor changes to our implementation of CAA Account and Method
Binding, as a result of reviewing the code in preparation for enabling
it in production. Specifically:
- Ensure that the validation method and account ID are included at the
request level, rather than waiting until we perform the checks which use
those parameters;
- Clean up code which assumed the validation method and account ID might
not be populated;
- Use the core.AcmeChallenge type (rather than plain string) for the
validation method everywhere;
- Update comments to reference the latest version and correct sections
of the CAA RFCs; and
- Remove the CAA feature flags from the config integration tests to
reflect that they are not yet enabled in prod.

I have reviewed this code side-by-side with RFC 8659 (CAA) and RFC 8657
(ACME CAA Account and Method Binding) and believe it to be compliant
with both.
2022-11-17 13:31:04 -08:00
Aaron Gable 89f7fb1636
Clean up go1.19 TODOs (#6464)
Clean up several spots where we were behaving differently on
go1.18 and go1.19, now that we're using go1.19 everywhere. Also
re-enable the lint and generate tests, and fix the various places where
the two versions disagreed on how comments should be formatted.

Also clean up the OldTLS codepaths, now that both go1.19 and our
own feature flags have forbidden TLS < 1.2 everywhere.

Fixes #6011
2022-10-21 15:54:18 -07:00
Samantha bdd9ad9941
grpc: Pass data necessary for Retry-After headers in BoulderErrors (#6415)
- Add a new field, `RetryAfter` to `BoulderError`s
- Add logic to wrap/unwrap the value of the `RetryAfter` field to our gRPC error
  interceptor
- Plumb `RetryAfter` for `DuplicateCertificateError` emitted by RA to the WFE
  client response header
  
Part of #6256
2022-10-03 16:24:58 -07:00
Samantha 90eb90bdbe
test: Replace sd-test-srv with consul (#6389)
- Add a dedicated Consul container
- Replace `sd-test-srv` with Consul
- Add documentation for configuring Consul
- Re-issue all gRPC credentials for `<service-name>.service.consul`

Part of #6111
2022-09-19 16:13:53 -07:00
Aaron Gable 0340b574d9
Add unparam linter to CI (#6312)
Enable the "unparam" linter, which checks for unused function
parameters, unused function return values, and parameters and
return values that always have the same value every time they
are used.

In addition, fix many instances where the unparam linter complains
about our existing codebase. Remove error return values from a
number of functions that never return an error, remove or use
context and test parameters that were previously unused, and
simplify a number of (mostly test-only) functions that always take the
same value for their parameter. Most notably, remove the ability to
customize the RSA Public Exponent from the ceremony tooling,
since it should always be 65537 anyway.

Fixes #6104
2022-08-23 12:37:24 -07:00
Aaron Gable d1b211ec5a
Start testing on go1.19 (#6227)
Run the Boulder unit and integration tests with go1.19.

In addition, make a few small changes to allow both sets of
tests to run side-by-side. Mark a few tests, including our lints
and generate checks, as go1.18-only. Reformat a few doc
comments, particularly lists, to abide by go1.19's stricter gofmt.

Causes #6275
2022-08-10 15:30:43 -07:00
Aaron Gable 9c197e1f43
Use io and os instead of deprecated ioutil (#6286)
The ioutil package has been deprecated since go1.16; the various
functions it provided now exist in the os and io packages. Replace all
instances of ioutil with either io or os, as appropriate.
2022-08-10 13:30:17 -07:00
Aaron Gable 8227c8fcb2
Fix flake in TestMultiVA (#6141)
Fix a race condition in one of the TestMultiVA test cases.

There are many other possible flakes in these tests, because
they use real HTTP to talk to a fake remote VA on localhost,
but we can at least trivially remove this race condition.

Fixes #6119
2022-05-25 14:25:41 -07:00
Aaron Gable 9b4ca235dd
Update boulder-tools dependencies (#6129)
Update:
- golangci-lint from v1.42.1 to v1.46.2
- protoc from v3.15.6 to v3.20.1
- protoc-gen-go from v1.26.0 to v1.28.0
- protoc-gen-go-grpc from v1.1.0 to v1.2.0
- fpm from v1.14.0 to v1.14.2

Also remove a reference to go1.17.9 from one last place.

This does result in updating all of our generated .pb.go files, but only
to update the version number embedded in each file's header.

Fixes #6123
2022-05-20 14:24:01 -07:00
Aaron Gable 3a94adecb1
Fix nil pointer dereference in TestMultiVA (#6132)
If the test unexpectedly failed, it would hit a nil pointer dereference
on line 511 when it tries to access the `.Type` field of nil. Add another
case to handle this.
2022-05-19 17:16:06 -07:00
Aaron Gable 8cb01a0c34
Enable additional linters (#6106)
These new linters are almost all part of golangci-lint's collection
of default linters, that would all be running if we weren't setting
`disable-all: true`. By adding them, we now have parity with the
default configuration, as well as the additional linters we like.

Adds the following linters:
* unconvert
* deadcode
* structcheck
* typecheck
* varcheck
* wastedassign
2022-05-11 13:58:58 -07:00
Jacob Hoffman-Andrews cf9df961ba
Add feature flags for upcoming deprecations (#6043)
This adds three feature flags: SHA1CSRs, OldTLSOutbound, and
OldTLSInbound. Each controls the behavior of an upcoming deprecation
(except OldTLSInbound, which isn't yet scheduled for a deprecation
but will be soon). Note that these feature flags take advantage of
`features`' default values, so they can default to "true" (that is, each
of these features is enabled by default), and we set them to "false"
in the config JSON to turn them off when the time comes.

The unittest for OldTLSOutbound requires that `example.com` resolves
to 127.0.0.1. This is because there's logic in the VA that checks
that redirected-to hosts end in an IANA TLD. The unittest relies on
redirecting, and we can't use e.g. `localhost` in it because of that
TLD check, so we use example.com.

Fixes #6036 and #6037
2022-04-15 12:14:00 -07:00
Samantha a9ba5e42a0
VA: Add IP address to detailed errors (#6039)
Prepend the IP address of the remote host where HTTP-01 or TLS-ALPN-01
validation was attempted in the detailed error response body.

Fixes #6016
2022-04-13 12:55:35 -07:00
Aaron Gable ed912c3aa5
Remove duplication from TLS-ALPN-01 error messages (#6028)
Slightly refactor `validateTLSALPN01` to use a common function
to format the error messages it returns. This reduces code duplication
and makes the important validation logic easier to follow.

Fixes #5922
2022-04-04 09:17:16 -07:00
Jacob Hoffman-Andrews 07cb1179d0
Add logging of "oldTLS" bit (#6008)
This causes the VA to emit ValidationRecords with the OldTLS bit set if
it observes a redirect to HTTPS that negotiates TLS < 1.2.

I've manually tested but there is not yet an integration test. I need
to make a parallel change in challtestsrv and then incorporate here.
2022-03-21 11:34:03 -07:00
Aaron Gable b19b79162f
Minor updates from review of the HTTP-01 method (#5975)
Make minor updates to our implementation of the HTTP-01 validation method based
on in-depth review of BRs Section 3.2.2.4.19 and RFC 8555 Section 8.3.
- Move the HTTP response code check above parsing the body.
- Explicitly check for 301, 302, 307, and 308 redirect codes, so that if the go
  stdlib updates to allow additional redirects we don't follow suit.
- Trim additional forms of white-space from the key authorization.
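
The explicit redirect allow-list from the second item above could be sketched like this. The function name is hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// isFollowableRedirect reports whether status is one of the four redirect
// codes named above: 301, 302, 307, or 308. Any other 3xx code, including
// 303, is not followed.
func isFollowableRedirect(status int) bool {
	switch status {
	case http.StatusMovedPermanently, // 301
		http.StatusFound,             // 302
		http.StatusTemporaryRedirect, // 307
		http.StatusPermanentRedirect: // 308
		return true
	}
	return false
}

func main() {
	for _, code := range []int{301, 302, 303, 307, 308} {
		fmt.Println(code, isFollowableRedirect(code))
	}
}
```

Listing the codes explicitly means that a future change in the standard library's redirect handling would not silently widen what gets followed.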
2022-03-03 11:23:10 -08:00
Aaron Gable c94a24897f
Remove go1.16 backwards compatibility hacks (#5952)
These were needed for the transition from go1.16 to go1.17. We
don't run go1.16 anywhere anymore, so they can be removed.
2022-02-22 14:23:28 -08:00
Aaron Gable cfab636c5a
TLS-ALPN-01: Check that challenge cert is self-signed (#5928)
RFC 8737 says "The client prepares for validation by constructing a self-signed
certificate...". Add a check for whether the challenge certificate is
self-signed by ensuring its issuer and subject are equal, and checking its
signature with its own public key.
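
A self-signed check of the kind described above can be sketched with the standard library. `checkSelfSigned` and `newSelfSigned` are hypothetical helpers; the second exists only to produce a throwaway certificate for the demo.

```go
package main

import (
	"bytes"
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// checkSelfSigned returns nil if cert's issuer equals its subject and its
// signature verifies under its own public key.
func checkSelfSigned(cert *x509.Certificate) error {
	if !bytes.Equal(cert.RawIssuer, cert.RawSubject) {
		return errors.New("issuer does not match subject")
	}
	return cert.CheckSignature(cert.SignatureAlgorithm, cert.RawTBSCertificate, cert.Signature)
}

// newSelfSigned builds a throwaway self-signed certificate for the demo.
func newSelfSigned(name string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: name},
		DNSNames:     []string{name},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	// Using the template as its own parent makes the result self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := newSelfSigned("example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(checkSelfSigned(cert) == nil)
}
```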

Also slightly refactor the helper methods to return only a single cert, since we
only care about the first one returned. And add a test.
2022-02-02 16:45:13 -08:00
Aaron Gable 305ef9cce9
Improve error checking paradigm (#5920)
We have decided that we don't like the if err := call(); err != nil
syntax, because it creates confusing scopes, but we have not cleaned up
all existing instances of that syntax. However, we have now found a
case where that syntax enables a bug: It caused readers to believe that
a later err = call() statement was assigning to an already-declared err
in the local scope, when in fact it was assigning to an
already-declared err in the parent scope of a closure. This caused our
ineffassign and staticcheck linters to be unable to analyze the
lifetime of the err variable, and so they did not complain when we
never checked the actual value of that error.

This change standardizes on the two-line error checking syntax
everywhere, so that we can more easily ensure that our linters are
correctly analyzing all error assignments.
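
The scoping confusion described above can be illustrated with a small sketch. `step`, `confusing`, and `run` are hypothetical; they show the two styles rather than the code where the bug was found.

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical operation that may fail.
func step(fail bool) error {
	if fail {
		return errors.New("step failed")
	}
	return nil
}

// confusing mixes the inline form with a later plain assignment inside a
// closure. The inline err is scoped to the if statement only, so the later
// assignment writes to the err captured from the enclosing function.
func confusing() error {
	var err error
	func() {
		if err := step(false); err != nil {
			return
		}
		err = step(true) // assigns to the outer err, not the one above
	}()
	return err
}

// run uses the two-line form throughout: every err below is the single
// variable declared on the first line.
func run() error {
	err := step(false)
	if err != nil {
		return err
	}
	err = step(true)
	if err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(confusing())
	fmt.Println(run())
}
```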
2022-02-01 14:42:43 -07:00
J.C. Jones f511442e84
Don't allow multiple certificate extensions from TLS-ALPN-01 (#5919)
Detect when a non-compliant ACME client presents a non-compliant X.509
certificate that contains multiple copies of the SubjectAlternativeName or ACME
Identifier extensions; this should be forbidden.
2022-02-01 13:37:51 -08:00
Aaron Gable 25f5e40e77
Require that TLS-ALPN-01 cert have only dnsNames (#5916)
RFC 8737 states that the certificate presented during the "acme-tls/1"
handshake has "a subjectAltName extension containing the dNSName
being validated and no other entries". We were checking that it contained
no other dNSNames, but not requiring that it not have any other kinds of
Subject Alternative Names.

Factor all of our SAN checks into a helper function. Have that function
construct the expected bytes of the SAN extension from the one DNS
name we expect to see, and assert that the actual bytes match the
expectation. Add non-DNS-name identifiers to our error output when we
encounter a cert whose SANs don't match. And add tests which check
that we fail the validation when the cert has multiple SANs.
2022-01-28 15:38:57 -08:00
Aaron Gable 4835709232
Remove support for obsolete id-pe-acmeIdentifier OID (#5906)
Current metrics show that subscribers present certificates using the
obsolete OID to identify their id-pe-acmeIdentifier extension about
an order of magnitude less often than they present the correct OID.
Remove support for the never-standardized OID.
2022-01-25 10:10:03 -08:00
Aaron Gable 8c28e49ab6
Enforce TLS1.2 when validating TLS-ALPN-01 (#5905)
RFC 8737, Section 4, states "ACME servers that implement "acme-tls/1"
MUST only negotiate TLS 1.2 [RFC5246] or higher when connecting to
clients for validation." Enforce that our outgoing connections to validate
TLS-ALPN-01 challenges do not negotiate TLS1.1.
2022-01-25 09:57:34 -08:00
Aaron Gable ab79f96d7b
Fixup staticcheck and stylecheck, and violations thereof (#5897)
Add `stylecheck` to our list of lints, since it got separated out from
`staticcheck`. Fix the way we configure both to be clearer and not
rely on regexes.

Additionally fix a number of easy-to-change `staticcheck` and
`stylecheck` violations, allowing us to reduce our number of ignored
checks.

Part of #5681
2022-01-20 16:22:30 -08:00
Aaron Gable 2f2bac4bf2
Improve readability of A and AAAA lookup errors (#5843)
When we query DNS for a host, and both the A and AAAA lookups fail or
are empty, combine both errors into a single error rather than only
returning the error from the A lookup.

Fixes #5819
Fixes #5319
2022-01-03 10:39:25 -08:00
Jacob Hoffman-Andrews 4205400a98
Lower logDNSError to info level. (#5701)
These log lines are sometimes useful for debugging, but are a normal
part of operation, not an error: Unbound will allow a response to
timeout if the remote server is too slow.
2021-10-12 10:44:54 -06:00
Samantha 6eee230d69
BDNS: Ensure DNS server addresses are dialable (#5520)
- Add function `validateServerAddress()` to `bdns/servers.go` which ensures that
  DNS server addresses are TCP/UDP dial-able per: https://golang.org/src/net/dial.go?#L281
- Add unit test for `validateServerAddress()` in `bdns/servers_test.go`
- Update `cmd/boulder-va/main.go` to handle `bdns.NewStaticProvider()`
  potentially returning an error.
- Update unit tests in `bdns/dns_test.go`:
  - Handle `bdns.NewStaticProvider()` potentially returning an error
  - Add an IPv6 address to `TestRotateServerOnErr`
- Ensure DNS server addresses are validated by `validateServerAddress` whenever:
  - `dynamicProvider.update()` is called
  - `staticProvider` is constructed
- Construct server addresses using `net.JoinHostPort()` when
  `dynamicProvider.Addrs()` is called

Fixes #5463
2021-07-20 10:11:11 -07:00
Aaron Gable 4c581436a3
Add go1.17beta1 to CI (#5483)
Add go1.17beta1 docker images to the set of things we build,
and integrate go1.17beta1 into the set of environments CI runs.
Fix one test which breaks due to an underlying refactoring in
the `crypto/x509` stdlib package. Fix one other test which breaks
due to new guarantees in the stdlib's TLS ALPN implementation.

Also removes go1.16.5 from CI so we're only running 2 versions.

Fixes #5480
2021-07-13 10:00:04 -07:00
Aaron Gable 64c9ec350d
Unify protobuf generation (#5458)
Create script which finds every .proto file in the repo and correctly
invokes `protoc` for each. Create a single file with a `//go:generate`
directive to invoke the new script. Delete all of the other generate.go
files, so that our proto generation is unified in one place.

Fixes #5453
2021-06-07 08:49:15 -07:00
Aaron Gable 9abb39d4d6
Honeycomb integration proof-of-concept (#5408)
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration to our config files, and by
default configure all of our tests to "mute" (send nothing).

Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.

Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
2021-05-24 16:13:08 -07:00
Aaron Gable a19ebfa0e9
VA: Query SRV to preload/cache DNS resolver addrs (#5360)
Abstract out the way that the bdns library keeps track of the
resolvers it uses to do DNS lookups. Create one implementation,
the `StaticProvider`, which behaves exactly the same as the old
mechanism (providing whatever names or addresses were given
in the config). Create another implementation, `DynamicProvider`,
which re-resolves the provided name on a regular basis.

The dynamic provider consumes a single name, does a lookup
on that name for any SRV records suggesting that it is running a
DNS service, and then looks up A records to get the address of
all the names returned by the SRV query. It exports its successes
and failures as a prometheus metric.

Finally, update the tests and config-next configs to work with
this new mechanism. Give sd-test-srv the capability to respond
to SRV queries, and put the names it provides into docker's
default DNS resolver.

Fixes #5306
2021-04-20 10:11:53 -07:00
Samantha 6cd59b75f2
VA: Don't follow 303 redirects (#5384)
- VA should reject redirects with an HTTP status code of 303
- Add 303 redirect test
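
A rough sketch of the kind of status-code check this implies is shown below. The function name `followable` and the exact set of accepted codes (301, 302, 307, 308) are assumptions for illustration; the message itself only states that 303 is rejected.

```go
package main

import (
	"fmt"
	"net/http"
)

// followable reports whether a redirect with the given status code would be
// followed. The accepted set here is an assumption; the only behavior taken
// from the commit message is that 303 (See Other) is refused.
func followable(status int) bool {
	switch status {
	case http.StatusMovedPermanently, // 301
		http.StatusFound,             // 302
		http.StatusTemporaryRedirect, // 307
		http.StatusPermanentRedirect: // 308
		return true
	}
	return false
}

func main() {
	for _, code := range []int{301, 302, 303, 307, 308} {
		fmt.Println(code, followable(code))
	}
}
```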

Fixes #5358
2021-04-05 11:29:01 -07:00
Jacob Hoffman-Andrews 7194624191
Update grpc and protobuf to latest. (#5369)
protoc now generates grpc code in a separate file from protobuf code.
Also, grpc servers are now required to embed an "unimplemented"
interface from the generated .pb.go file, which provides forward
compatibility.

Update the generate.go files since the invocation for protoc has changed
with the split into .pb.go and _grpc.pb.go.

Fixes #5368
2021-04-01 17:18:15 -07:00
Aaron Gable ef1d3c4cde
Standardize on `AssertMetricWithLabelsEquals` (#5371)
Update all of our tests to use `AssertMetricWithLabelsEquals`
instead of combinations of the older `CountFoo` helpers with
simple asserts. This coalesces all of our prometheus inspection
logic into a single function, allowing the deletion of four separate
helper functions.
2021-04-01 15:20:43 -07:00
Andrew Gabbitas 3d9d5e2306
Cleanup go1.15.7 (#5374)
Remove code that is no longer needed after migrating to go1.16.x.
Remove testing with go1.15.7 in the test matrix.

Fixes #5321
2021-04-01 10:50:18 -07:00
Andrew Gabbitas 81eed0cd07
Replace invalid UTF-8 in error message (#5341)
Sanitize the HTTP body when it is passed as an error so that it can be
properly marshalled for gRPC.
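
For context, proto3 string fields must hold valid UTF-8, so an arbitrary HTTP body cannot be copied into one unchanged. A minimal sketch of the general technique follows; the helper name `sanitizeBody` is hypothetical, and using the Unicode replacement character is an assumption rather than a description of what this change does.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sanitizeBody replaces each run of invalid UTF-8 bytes with U+FFFD so the
// result is always a valid UTF-8 string. (Hypothetical helper.)
func sanitizeBody(body []byte) string {
	return strings.ToValidUTF8(string(body), "\ufffd")
}

func main() {
	body := []byte("bad \xff\xfe body")
	clean := sanitizeBody(body)
	fmt.Println(utf8.Valid(body), utf8.ValidString(clean))
}
```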

Fixes #5317
2021-03-16 14:10:16 -06:00
Aaron Gable 95b77dbd25
Remove va gRPC wrapper (#5328)
Delete the ValidationAuthorityGRPCServer and ...GRPCClient structs,
and update references to instead reference the underlying vapb.VAClient
type directly. Also delete the core.ValidationAuthority interface.

Does not require updating interfaces elsewhere, as the client
wrapper already included the variadic grpc.CallOption parameter.

Fixes #5325
2021-03-11 15:38:50 -08:00
Andrew Gabbitas ceffe18dfc
Add testing for golang 1.16 (#5313)
- Add 1.16.1 to the GitHub CI test matrix
- Fix tlsalpn tests for go 1.16.1 but maintain compatibility with 1.15.x
- Fix integration tests.

Fix: #5301
Fix: #5316
2021-03-11 11:47:41 -08:00
Andrew Gabbitas f5362fba24
Add Validated time field to challenges (#5288)
Move the validated timestamp to the RA where the challenge is passed to
the SA for database storage. If a challenge becomes valid or invalid, take
the validated timestamp and store it in the attemptedAt field of the
authz2 table. Upon retrieval of the challenge from the database, add the
attemptedAt value to challenge.Validated which is passed back to the WFE
and presented to the user as part of the challenge as required in ACME
RFC8555.

Fix: #5198
2021-03-10 14:39:59 -08:00
Jacob Hoffman-Andrews 2a8f0fe6ac
Rename several items in bdns (#5260)
[Go style says](https://blog.golang.org/package-names):

> Avoid stutter. Since client code uses the package name as a prefix
> when referring to the package contents, the names for those contents
> need not repeat the package name. The HTTP server provided by the
> http package is called Server, not HTTPServer. Client code refers to
> this type as http.Server, so there is no ambiguity.

Rename DNSClient, DNSClientImpl, NewDNSClientImpl,
NewTestDNSClientImpl, DNSError, and MockDNSClient to follow those
guidelines.

Unexport DNSClientImpl and MockTimeoutError (was only used internally).

Make New and NewTest return the Client interface rather than a concrete
`impl` type.
2021-01-29 17:20:35 -08:00
Jacob Hoffman-Andrews 2a6cb72518
Speed up VA test. (#5261)
We had a test that relied on sleeping to hit a timeout. This doesn't
remove the sleep, but it does tighten the duration significantly. Brings
unit test time for the VA from 11 seconds to 1.7 seconds on my machine.
2021-01-29 17:07:58 -08:00
Andrew Gabbitas aa20bcaded
Add validated timestamp to challenges (#5253)
RFC8555 requires a validated timestamp in challenges where status = valid,
but we do not currently present one.

This change is the first step to presenting challenge timestamps to the
client. It adds a timestamp to each place where we change a challenge to
valid. This only displays in the logs and will not display to the
subscriber because it is not yet stored somewhere retrievable. The next
step will be to store it in the database and then finally present it to
the client.

Part of #5198
2021-01-29 08:07:32 -08:00
Andrew Gabbitas a0d12af73c
Detect redirect loops in VA (#5234)
Currently the VA checks to see how many redirects have been followed and
bails out if greater than maxRedirect (10), but it does not check to see
if any redirect url has been followed twice which would mean a broken
infinite redirect loop. Storing the validation records for these is
relatively expensive because we store a record for each hop in the
redirect.

This change checks the previous redirect records to see if the URL has
been used before and returns an error if it has. This will catch a redirect
loop earlier than the maxRedirect value in most cases.
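
The check described above can be sketched roughly as follows. The function `checkRedirect` and its slice-of-URLs argument are assumptions for illustration; the VA's actual records carry more than a URL.

```go
package main

import "fmt"

// maxRedirect mirrors the limit of 10 mentioned in the commit message.
const maxRedirect = 10

// checkRedirect returns an error if following next would exceed the redirect
// limit or revisit a URL that already appears in visited.
// (Hypothetical helper, not the VA's code.)
func checkRedirect(visited []string, next string) error {
	if len(visited) >= maxRedirect {
		return fmt.Errorf("too many redirects (limit %d)", maxRedirect)
	}
	for _, u := range visited {
		if u == next {
			return fmt.Errorf("redirect loop detected at %q", next)
		}
	}
	return nil
}

func main() {
	visited := []string{"http://a.example/", "http://b.example/"}
	fmt.Println(checkRedirect(visited, "http://c.example/"))
	fmt.Println(checkRedirect(visited, "http://a.example/"))
}
```

In this sketch a two-URL loop is rejected on the third hop instead of after ten, which is the saving in stored validation records that the message refers to.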

Fixes #5224
2021-01-19 16:38:03 -08:00
Samantha 802d4fed9d
Return full CAA RR response from bdns to va (#5181)
When the VA encounters CAA records, it logs the contents of those
records. When those records were the result of following a chain of
CNAMEs, the CNAMEs are included as part of the response from our
recursive resolver. However, the current flow for logging the responses
logs only the CAA records, not the CNAMEs.

This change returns the complete dig-style RR response from bdns to the
va where the response of the authoritative CAA RR is string-quoted and
logged.

This dig-style RR response is quite verbose; however, it is only ever
returned from bdns.LookupCAA when a CAA response is non-empty. If the CAA
response is empty, only an empty string is returned.

Fixes #5082
2020-12-10 18:17:04 -08:00
Aaron Gable 294d1c31d7
Use error wrapping for berrors and tests (#5169)
This change adds two new test assertion helpers, `AssertErrorIs`
and `AssertErrorWraps`. The former is a wrapper around `errors.Is`,
and asserts that the error's wrapping chain contains a specific (i.e.
singleton) error. The latter is a wrapper around `errors.As`, and
asserts that the error's wrapping chain contains any error which is
of the given type; it also has the same unwrapping side effect as
`errors.As`, which can be useful for further assertions about the
contents of the error.

It also makes two small changes to our `berrors` package, namely
making `berrors.ErrorType` itself an error rather than just an int,
and giving `berrors.BoulderError` an `Unwrap()` method which
exposes that inner `ErrorType`. This allows us to use the two new
helpers above to make assertions about berrors, rather than
having to hand-roll equality assertions about their types.

Finally, it takes advantage of the two changes above to greatly
simplify many of the assertions in our tests, removing conditional
checks and replacing them with simple assertions.
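
A simplified sketch of the pattern follows. The types below imitate the shape described above but are assumptions written for this example, not copies of the `berrors` package; `lookupAuthz` and `hasType` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrorType is an int that also satisfies the error interface, so it can
// serve as a singleton target for errors.Is.
type ErrorType int

const (
	NotFound ErrorType = iota
	Malformed
)

func (t ErrorType) Error() string {
	return fmt.Sprintf("error type %d", int(t))
}

// BoulderError carries a type and a detail message. Unwrap exposes the
// inner ErrorType to the errors package.
type BoulderError struct {
	Type   ErrorType
	Detail string
}

func (e *BoulderError) Error() string { return e.Detail }
func (e *BoulderError) Unwrap() error { return e.Type }

// lookupAuthz returns a BoulderError wrapped in extra context.
func lookupAuthz() error {
	inner := &BoulderError{Type: NotFound, Detail: "no such authz"}
	return fmt.Errorf("looking up authz: %w", inner)
}

// hasType reports whether err's wrapping chain contains the given type.
func hasType(err error, t ErrorType) bool {
	return errors.Is(err, t)
}

func main() {
	err := lookupAuthz()
	fmt.Println(hasType(err, NotFound))
	fmt.Println(hasType(err, Malformed))
}
```

Because `Unwrap` ends the chain at the `ErrorType` value, a single `errors.Is` call can replace a type assertion followed by a field comparison.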
2020-11-06 13:17:11 -08:00
Samantha 387e94407c
va: replacing error assertions with errors.As (#5136)
errors.As searches a wrapped error chain for an error of a specific type
(see https://golang.org/pkg/errors/#As), as opposed to asserting
that the outermost error is of that type.
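
The difference shows up as soon as an error is wrapped. In this small sketch, `timeoutError`, `fetch`, and the two helper functions are hypothetical names introduced for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// timeoutError is a stand-in for any concrete error type a caller might
// want to detect.
type timeoutError struct{ op string }

func (e *timeoutError) Error() string { return e.op + ": timed out" }

// fetch returns a timeoutError wrapped with additional context.
func fetch() error {
	return fmt.Errorf("fetching challenge: %w", &timeoutError{op: "dial"})
}

// viaAssertion inspects only the outermost error value.
func viaAssertion(err error) bool {
	_, ok := err.(*timeoutError)
	return ok
}

// viaErrorsAs walks the whole wrapping chain.
func viaErrorsAs(err error) bool {
	var te *timeoutError
	return errors.As(err, &te)
}

func main() {
	err := fetch()
	fmt.Println(viaAssertion(err), viaErrorsAs(err)) // prints "false true"
}
```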

Part of #5010
2020-10-30 15:51:29 -07:00
Jacob Hoffman-Andrews bf7c80792d
core: move to proto3 (#5063)
Builds on #5062
Part of #5050
2020-08-31 17:58:32 -07:00
Aaron Gable 8556d8a801
Update VA RPCs to proto3 (#5005)
This updates va.proto to use proto3 syntax, and updates
all clients of the autogenerated code to use the new types.
In particular, it removes indirection from built-in types
(proto3 uses ints, rather than pointers to ints, for example).

Depends on #5003
Fixes #4956
2020-08-17 15:20:51 -07:00
Aaron Gable e2c8f6743a
Introduce new core.AcmeChallenge type (#5012)
ACME Challenges are well-known strings ("http-01", "dns-01", and
"tls-alpn-01") identifying which kind of challenge should be used
to verify control of a domain. Because they are well-known and
only certain values are valid, it is better to represent them as
something more akin to an enum than as bare strings. This also
improves our ability to ensure that an AcmeChallenge is not
accidentally used as some other kind of string in a different
context. This change also brings them closer in line with the
existing core.AcmeResource and core.OCSPStatus string enums.
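
A string-backed enum of this kind might look like the sketch below. The constant names and the `IsValid` method are assumptions for illustration and may not match the identifiers in `core`; the TLS-ALPN value uses the `tls-alpn-01` spelling from RFC 8737.

```go
package main

import "fmt"

// AcmeChallenge is a distinct string type, so a bare string variable cannot
// be passed where a challenge type is expected without an explicit
// conversion.
type AcmeChallenge string

// Constant names are hypothetical.
const (
	ChallengeHTTP01    = AcmeChallenge("http-01")
	ChallengeDNS01     = AcmeChallenge("dns-01")
	ChallengeTLSALPN01 = AcmeChallenge("tls-alpn-01")
)

// IsValid reports whether c is one of the known challenge types.
func (c AcmeChallenge) IsValid() bool {
	switch c {
	case ChallengeHTTP01, ChallengeDNS01, ChallengeTLSALPN01:
		return true
	}
	return false
}

func main() {
	fmt.Println(ChallengeDNS01.IsValid())
	fmt.Println(AcmeChallenge("smtp-01").IsValid())
}
```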

Fixes #5009
2020-08-11 15:02:16 -07:00
Aaron Gable 8920b698ea
Report canceled remote validations as problems (#5011)
Previously, canceled remote validations were simply noted and then
dropped on the floor. This should be safe, as they're theoretically
only canceled when the parent span (i.e. the local PerformValidation
RPC) ends. But for the sake of defense-in-depth, it seems better to
correctly mark canceled remote validations as having Problems, so
that their results cannot be accidentally used anywhere.

This results in a test behavior change: if EnforceMultiVA is on, and
some RPCs are canceled, this now results in validation failure. This
should not have any production impact, because remote validations
should only be canceled when the parent RPC early-exits, but that only
happens when EnforceMultiVA is not enabled. These tests now test a
case where the other remote validations were canceled for some other
reason, which should result in validation failure.
2020-08-11 09:29:49 -07:00
Aaron Gable 0f5d2064a8
Remove logic from VA PerformValidation wrapper (#5003)
Updates the type of the ValidationAuthority's PerformValidation
method to be identical to that of the corresponding auto-generated
grpc method, i.e. directly taking and returning proto message
types, rather than exploded arguments.

This allows all logic to be removed from the VA wrappers, which
will allow them to be fully removed after the migration to proto3.

Also updates all tests and VA clients to adopt the new interface.

Depends on #4983 (do not review first four commits)
Part of #4956
2020-08-06 10:45:35 -07:00
Aaron Gable 634d57ce86
Use 2-space indents in all proto files (#5006)
Our proto files had a variety of indentation styles: 2 spaces,
4 spaces, 8 spaces, and tabs; sometimes mixed within the same
file. The proto3 style guide[1] says to use 2-space indents,
so this change standardizes on that.

[1] https://developers.google.com/protocol-buffers/docs/style
2020-08-05 10:38:19 -07:00
Roland Bracewell Shoemaker 75b034637b
Update travis go versions (remove 1.14.1, add 1.15rc1) (#5002)
Fixes #4919.
2020-08-04 12:13:09 -07:00
Aaron Gable 7e626b63a6
Temporarily revert CA and VA proto3 migrations (#4962) 2020-07-16 14:29:42 -07:00
Aaron Gable 281575433b
Switch VA RPCs to proto3 (#4960)
This updates va.proto to use proto3 syntax, and updates
all clients of the autogenerated code to use the new types. In
particular, it removes indirection from built-in types (proto3
uses ints, rather than pointers to ints, for example).

Fixes #4956
2020-07-16 09:16:23 -07:00
orangepizza dee757c057
Remove multiva exception list code (#4933)
Fixes #4931
2020-07-08 10:57:17 -07:00
Roland Bracewell Shoemaker 325bba3a6f
va: measure local validation latency separately (#4865) 2020-06-12 12:44:25 -07:00
Jacob Hoffman-Andrews b1347fb3b3
Upgrade to latest protoc and protoc-gen-go (#4794)
There are some changes to the code generated in the latest version, so
this modifies every .pb.go file.

Also, the way protoc-gen-go decides where to put files has changed, so
each generate.go gets the --go_opt=paths=source_relative flag to
tell protoc to continue placing output next to the input.

Remove staticcheck from build.sh; we get it via golangci-lint now.

Pass --no-document to gem install fpm; this is recommended in the fpm docs.
2020-04-23 18:54:44 -07:00
Jacob Hoffman-Andrews 4a2029b293
Use explicit fmt.Sprintf for ProblemDetails (#4787)
In #3708, we added formatters for the convenience methods in the
`probs` package.

However, in #4783, @alexzorin pointed out that we were incorrectly
passing an error message through fmt.Sprintf as the format parameter
rather than as a value parameter.

I proposed a fix in #4784, but during code review we concluded that the
underlying problem was the pattern of using format-style functions that
don't have some variant of printf in the name. That makes this wrong:
`probs.DNS(err.Error())`, and this right: `probs.DNS("%s", err)`. Since
that's an easy mistake to make and a hard one to spot during code review,
we're going to stop using this particular pattern and call `fmt.Sprintf`
directly.

This PR reverts #3708 and adds some `fmt.Sprintf` where needed.
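
The pitfall is easy to reproduce outside Boulder. In this sketch, `dnsProblem` is a hypothetical stand-in for a format-style helper whose name gives no hint that its first argument is a format string; it is not the `probs` API.

```go
package main

import (
	"errors"
	"fmt"
)

// dnsProblem treats detail as a format string, like the helpers that #3708
// introduced. (Hypothetical stand-in.)
func dnsProblem(detail string, a ...interface{}) string {
	return fmt.Sprintf("DNS problem: "+detail, a...)
}

// errSample contains a '%' so that misuse is visible in the output.
var errSample = errors.New("query for 50%off.example failed")

// wrongDetail passes the error text as the format parameter.
func wrongDetail(err error) string { return dnsProblem(err.Error()) }

// rightDetail passes the error as a value parameter.
func rightDetail(err error) string { return dnsProblem("%s", err) }

func main() {
	fmt.Println(wrongDetail(errSample))
	fmt.Println(rightDetail(errSample))
}
```

In the first call, `%o` in the error text is parsed as a formatting verb with no operand, so the output contains `%!o(MISSING)` instead of the original hostname. The second call prints the message intact.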
2020-04-21 14:36:11 -07:00
Jacob Hoffman-Andrews 2d7337dcd0
Remove newlines from log messages. (#4777)
Since Boulder's log system adds checksums to lines, but log-validator
processes entries on a per-line basis, including newlines in log
messages can cause a validation failure.
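
One way to keep a message on a single line is to escape line breaks before logging. This is only a sketch of the idea; the helper name `singleLine` and the choice to escape instead of strip are assumptions, not a description of what this change does.

```go
package main

import (
	"fmt"
	"strings"
)

// lineBreakEscaper rewrites carriage returns and newlines as visible escape
// sequences so each log entry stays on one physical line.
var lineBreakEscaper = strings.NewReplacer("\r", `\r`, "\n", `\n`)

// singleLine returns msg with all line breaks escaped. (Hypothetical helper.)
func singleLine(msg string) string {
	return lineBreakEscaper.Replace(msg)
}

func main() {
	fmt.Println(singleLine("first line\nsecond line"))
}
```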
2020-04-16 16:49:08 -07:00
Jacob Hoffman-Andrews bc528cf8cd
Error when redirect target is too long. (#4775)
This can happen when a misconfiguration redirects a certain path to
itself, doubled. After 10 redirects the error message can get quite
long. Instead we halt things at 2000 bytes, which should be more than
enough.
2020-04-15 13:44:26 -07:00
Jacob Hoffman-Andrews 72deb5b798
gofmt code with -s (simplify) flag (#4763)
Found by golangci-lint's `gofmt` linter.
2020-04-08 17:25:35 -07:00
Jacob Hoffman-Andrews 75024c3ec1
Replace clock.Default() with clock.New() (#4761)
clock.Default is deprecated:
https://godoc.org/github.com/jmhodges/clock#Default
2020-04-08 17:23:43 -07:00
Jacob Hoffman-Andrews cdb0bddbd8
Prefix error names with "Err" (#4755)
Staticcheck cleanup: https://staticcheck.io/docs/checks#ST1012
2020-04-08 17:19:35 -07:00
Jacob Hoffman-Andrews 27e785f3f2
VA: Add "During secondary validation:" error prefix. (#4677)
This should make it easier to distinguish errors that are triggered by
remote failures rather than local ones.
2020-02-14 14:00:08 -05:00
Daniel McCarney f1894f8d1d
tidy: typo fixes flagged by codespell (#4634) 2020-01-07 14:01:26 -05:00
Roland Bracewell Shoemaker 5b2f11e07e Switch away from old style statsd metrics wrappers (#4606)
In a handful of places I've nuked old stats which are not used in any alerts or dashboards as they either duplicate other stats or don't provide much insight/have never actually been used. If we feel like we need them again in the future it's trivial to add them back.

There aren't many dashboards that rely on old statsd style metrics, but a few will need to be updated when this change is deployed. There are also a few cases where prometheus labels have been changed from camel to snake case; dashboards that use these will also need to be updated. As far as I can tell, no alerts are impacted by this change.

Fixes #4591.
2019-12-18 11:08:25 -05:00
Daniel McCarney 6ed4ce23a8
bdns: move logDNSError to exchangeOne, log ErrId specially. (#4553)
We've found we need the context offered from logging the error closer to when it
happens in the `bdns` package rather than in the `va`. Adopting the function
requires adapting it slightly. Specifically in the new location we know it won't
be called with any timeout results, with a non-dns error, or with a nil
underlying error.

Having the logging done in `bdns` (and specifically from `exchangeOne`) also
lets us log the wire format of the query and response when we get a `dns.ErrId`
error indicating a query/response ID mismatch. A small unit test is included
that ensures the logging happens as expected.

In case it proves useful for matching against other metrics the DNS ID mismatch
error case also now increments a dedicated prometheus counter vector stat,
`dns_id_mismatch`. The stat is labelled by resolver and query type.

Resolves https://github.com/letsencrypt/boulder/issues/4532
2019-11-15 16:03:45 -05:00
Jacob Hoffman-Andrews 7f6caddc5b VA: log internal DNS errors. (#4520)
When we get a DNS error that has an internal cause (like connection
refused), we return a generic message like "networking error" to the
user to avoid revealing details that would be confusing. However, when
debugging problems with our own services, it's useful to have the
underlying errors.

This adds a helper method in the VA and calls it from each place we use
DNS errors.
2019-11-04 09:09:24 -05:00
Daniel McCarney 7b60b57c33
va: log account ID in multi VA differential JSON. (#4521)
This will reduce the amount of analysis time required to identify
large integrators that aren't compatible with multi VA.
2019-10-31 13:12:28 -04:00
Daniel McCarney 2926074a29
CI/Dev: enable TLS 1.3 (#4489)
Also update the VA's TLS-ALPN-01 TLS 1.3 unit test to not expect
a failure.
2019-10-17 14:01:38 -04:00