Remove the .Error() method from probs.ProblemDetails, so that it can no
longer be returned from functions which return an error. Update various
call sites to use the .String() method to get a textual representation
of the problem instead. Simplify ProblemDetailsForError to not
special-case and pass-through ProblemDetails, since they are no longer a
valid input to that function.
This reduces instances of "boxed nil" bugs, and paves the way for all of
the WFE methods to be refactored to simply return errors instead of
writing them directly into the response object.
Part of https://github.com/letsencrypt/boulder/issues/4980
Add `identifier` fields, which will soon replace the `dnsName` fields,
to:
- `corepb.Authorization`
- `corepb.Order`
- `rapb.NewOrderRequest`
- `sapb.CountFQDNSetsRequest`
- `sapb.CountInvalidAuthorizationsRequest`
- `sapb.FQDNSetExistsRequest`
- `sapb.GetAuthorizationsRequest`
- `sapb.GetOrderForNamesRequest`
- `sapb.GetValidAuthorizationsRequest`
- `sapb.NewOrderRequest`
Populate these `identifier` fields in every function that creates
instances of these structs.
Use these `identifier` fields instead of `dnsName` fields (at least
preferentially) in every function that uses these structs. When crossing
component boundaries, don't assume they'll be present, for
deployability's sake.
Deployability note: Mismatched `cert-checker` and `sa` versions will be
incompatible because of a type change in the arguments to
`sa.SelectAuthzsMatchingIssuance`.
Part of #7311
Add a new "MPICFullResults" feature flag. When this flag is enabled in
the VA, it will wait for all Remote VAs to return their results for both
Domain Control Validation and CAA checking, rather than short-circuiting
as soon as it has seen enough results to know whether corroboration will
or will not be achieved.
We make this change because waiting for these to return honestly doesn't
take that long, because we do validation (although not CAA rechecking)
asynchronously, and because it improves the quality of our MPIC quorum
summary logs (so we don't always say only 3/4 concurred because the
fourth was cancelled).
Fixes https://github.com/letsencrypt/boulder/issues/7809
Add an `identifier` field to the `va.PerformValidationRequest` proto, which will soon replace its `dnsName` field.
Accept and prefer the `identifier` field in every VA function that uses this struct. Don't (yet) assume it will be present.
Throughout the VA, accept and handle the IP address identifier type. Handling is similar to DNS names, except that `getAddrs` is not called, and consider that:
- IPs are represented in a different field in the `x509.Certificate` struct.
- IPs must be presented as reverse DNS (`.arpa`) names in SNI for [TLS-ALPN-01 challenge requests](https://datatracker.ietf.org/doc/html/rfc8738#name-tls-with-application-layer-).
- IPv6 addresses are enclosed in square brackets when composing or parsing URLs.
For HTTP-01 challenges, accept redirects to bare IP addresses, which were previously rejected.
Fixes#2706
Part of #7311
Replace DCV and CAA checks (PerformValidation and IsCAAValid) in
va/va.go and va/caa.go with their MPIC compliant counterparts (DoDCV and
DoCAA) in va/vampic.go. Deprecate EnforceMultiCAA and EnforceMPIC and
default code paths as though they are both true. Require that RIR and
Perspective be set for primary and remote VAs.
Fixes#7965Fixes#7819
Today, we have VA.PerformValidation, a method called by the RA at
challenge time to perform DCV and check CAA. We also have VA.IsCAAValid,
a method invoked by the RA at finalize time when a CAA re-check is
necessary. Both of these methods can be executed on remote VA
perspectives by calling the generic VA.performRemoteValidation.
This change splits VA.PerformValidation into VA.DoDCV and VA.DoCAA,
which are both called on remote VA perspectives by calling the generic
VA.doRemoteOperation. VA.DoDCV, VA.DoCAA, and VA.doRemoteOperation
fulfill the requirements of SC-067 V3: Require Multi-Perspective
Issuance Corroboration by:
- Requiring at least three distinct perspectives, as outlined in the
"Phased Implementation Timeline" in BRs section 3.2.2.9 ("Effective
March 15, 2025").
- Ensuring that the number of non-corroborating (failing) perspectives
remains below the threshold defined by the "Table: Quorum Requirements"
in BRs section 3.2.2.9.
- Ensuring that corroborating (passing) perspectives reside in at least
2 distinct Regional Internet Registries (RIRs) per the "Phased
Implementation Timeline" in BRs section 3.2.2.9 ("Effective March 15,
2026").
- Including an MPIC summary consisting of: passing perspectives, failing
perspectives, passing RIRs, and a quorum met for issuance (e.g., 2/3 or
3/3) in each validation audit log event, per BRs Section 5.4.1,
Requirement 2.8.
When the new SeparateDCVAndCAAChecks feature flag is enabled on the RA,
calls to VA.IsCAAValid (during finalization) and VA.PerformValidation
(during challenge) are replaced with calls to VA.DoCAA and a sequence of
VA.DoDCV followed by VA.DoCAA, respectively.
Fixes#7612Fixes#7614Fixes#7615Fixes#7616
- Ensure the Perspective and RIR reported by each remoteVA in the
*vapb.ValidationResult returned by VA.PerformValidation, matches the
expected local configuration when that configuration is present.
- Correct "AfriNIC" to "AFRINIC", everywhere.
Part of https://github.com/letsencrypt/boulder/issues/7819
- Remove undeployed feature flag MultiCAAFullResults
- Perform local CAA checks prior to initiating remote checks, instead of
starting remote checks and proceeding to perform local checks.
- Remove VA.IsCAAValid specific remote validation logic, use
VA.performRemoteOperation instead
- Refactor va.logRemoteResults to be easier to test and omit the RVA
problem
- Drive-by fix: Calculate logEvent.Latency with va.clk.Since() instead
of time.Since() like everything else in VA.performRemoteOperation
- Make performRemoteValidation a more generic function that returns a
new remoteResult interface
- Modify the return value of IsCAAValid and PerformValidation to satisfy
the remoteResult interface
- Include compile time checks and tests that pass an arbitrary operation
- Make the primary VA aware of the expected Perspective and RIR of each
remote VA.
- All Perspectives should be unique, have the primary VA check for
duplicate Perspectives at startup.
- Update test setup functions to ensure that each remote VA client and
corresponding inmem impl have a matching perspective and RIR.
Part of #7819
- Remove Perspective and RIR from ValidationRecords
- Make ValidationResultToPB Perspective and RIR aware
- Update comment for VA.PerformValidation
- Make verificationRequestEvent more like doDCVAuditLog
- Update language used in problems created by performRemoteValidation to
be more like doRemoteDCV.
Previously this was a configuration field.
Ports `maxAllowedFailures()` from `determineMaxAllowedFailures()` in
#7794.
Test updates:
Remove the `maxRemoteFailures` param from `setup` in all VA tests.
Some tests were depending on setting this param directly to provoke
failures.
For example, `TestMultiVAEarlyReturn` previously relied on "zero allowed
failures". Since the number of allowed failures is now 1 for the number
of remote VAs we were testing (2), the VA wasn't returning early with an
error; it was succeeding! To fix that, make sure there are two failures.
Since two failures from two RVAs wouldn't exercise the right situation,
add a third RVA, so we get two failures from three RVAs.
Similarly, TestMultiCAARechecking had several test cases that omitted
this field, effectively setting it to zero allowed failures. I updated
the "1 RVA failure" test case to expect overall success and added a "2
RVA failures" test case to expect overall failure (we previously
expected overall failure from a single RVA failing).
In TestMultiVA I had to change a test for `len(lines) != 1` to
`len(lines) == 0`, because with more backends we were now logging more
errors, and finding e.g. `len(lines)` to be 2.
- Add a new histogram, validationLatency
- Add a VA.observeLatency for observing validation latency
- Refactor to ensure this metric can be observed exclusively within
VA.PerformValidation and VA.IsCAAValid.
- Replace validationTime, localValidationTime, remoteValidationTime,
remoteValidationFailures, caaCheckTime, localCAACheckTime,
remoteCAACheckTime, and remoteCAACheckFailures with validationLatency
- Add `Perspective` and `RIR` fields to the remote-va configuration
- Configure RVA ValidationAuthorityImpl instances with the contents of
the JSON configuration
- Configure VA ValidationAuthorityImpl instances with the constant
`va.PrimaryPerspective`
- Log `Perspective` for non-Primary Perspectives, per the MPIC
requirements in section 5.4.1 (2) vii of the BRs. Also log the RIR for
posterity.
- Introduce `ValidationResult` RPC fields `Perspective` and `Rir`, which
are not currently used but will be required for corroboration in #7616
Fixes https://github.com/letsencrypt/boulder/issues/7613
Part of https://github.com/letsencrypt/boulder/issues/7615
Part of https://github.com/letsencrypt/boulder/issues/7616
Clean up how we handle identifiers throughout the Boulder codebase by
- moving the Identifier protobuf message definition from sa.proto to
core.proto;
- adding support for IP identifier to the "identifier" package;
- renaming the "identifier" package's exported names to be clearer; and
- ensuring we use the identifier package's helper functions everywhere
we can.
This will make future work to actually respect identifier types (such as
in Authorization and Order protobuf messages) simpler and easier to
review.
Part of https://github.com/letsencrypt/boulder/issues/7311
Find all gRPC fields which represent DNS Names -- sometimes called
"identifier", "hostname", "domain", "identifierValue", or other things
-- and unify their naming. This naming makes it very clear that these
values are strings which may be included in the SAN extension of a
certificate with type dnsName.
As we move towards issuing IP Address certificates, all of these fields
will need to be replaced by fields which carry both an identifier type
and value, not just a single name. This unified naming makes it very
clear which messages and methods need to be updated to support
non-dnsName identifiers.
Part of https://github.com/letsencrypt/boulder/issues/7647
Replace all of Boulder's usage of the Go stdlib "math/rand" package with
the newer "math/rand/v2" package which first became available in go1.22.
This package has an improved API and faster performance across the
board.
See https://go.dev/blog/randv2 and https://go.dev/blog/chacha8rand for
details.
Change the VA to perform remote validation wholly after local validation
and CAA checks, and to do so only if those local checks pass. This will
likely increase the latency of our successful validations, by making
them less parallel. However, it will reduce the amount of work we do on
unsuccessful validations, and reduce their latency, by not kicking off
and waiting for remote results.
Fixes https://github.com/letsencrypt/boulder/issues/7509
The core.Challenge.ProvidedKeyAuthorization field is problematic, both
because it is poorly named (which is admittedly easily fixable) and
because it is a field which we never expose to the client yet it is held
on a core type. Deprecate this field, and replace it with a new
vapb.PerformValidationRequest.ExpectedKeyAuthorization field.
Within the VA, this also simplifies the primary logic methods to just
take the expected key authorization, rather than taking a whole (largely
unnecessary) challenge object. This has large but wholly mechanical
knock-on effects on the unit tests.
While we're here, improve the documentation on core.Challenge itself,
and remove Challenge.URI, which was deprecated long ago and is wholly
unused.
Part of https://github.com/letsencrypt/boulder/issues/7514
These flags have been true and false, respectively, for years. We do not
expect to change them at any time in the future, and their continued
existence makes certain parts of the VA code significantly more complex.
Remove all references to them, preserving behavior in the "enforce, but
not full results" configuration.
IN-10358 tracks the corresponding config changes
Replaced our embeds of foopb.UnimplementedFooServer with
foopb.UnsafeFooServer. Per the grpc-go docs this reduces the "forwards
compatibility" of our implementations, but that is only a concern for
codebases that are implementing gRPC interfaces maintained by third
parties, and which want to be able to update those third-party
dependencies without updating their own implementations in lockstep.
Because we update our protos and our implementations simultaneously, we
can remove this safety net to replace runtime type checking with
compile-time type checking.
However, that replacement is not enough, because we never pass our
implementation objects to a function which asserts that they match a
specific interface. So this PR also replaces our reflect-based unittests
with idiomatic interface assertions. I do not view this as a perfect
solution, as it relies on people implementing new gRPC servers to add
this line, but it is no worse than the status quo which relied on people
adding the "TestImplementation" test.
Fixes https://github.com/letsencrypt/boulder/issues/7497
When we present error messages containing IP addresses to end users,
sometimes we're presenting the IP at which we began a chain of
redirects, but not presenting the IP at which the final redirect was
located. This can make it difficult for client operators to identify
exactly which server in their infrastructure is misbehaving.
Change our IP error messages to reference the most-recently-tried
address from the set of all validation records, rather than the address
which started the current (potential) redirect chain.
Fixes https://github.com/letsencrypt/boulder/issues/7347
Add a new field to the structured JSON object logged by the VA
indicating whether the HTTP-01 or TLS-ALPN-01 requests ended up
negotiating a TLS cipher suite which uses RSA key exchange. This is
useful for measuring how many servers we reach out to are RSA-only, so
we can determine the deprecation timeline for RSA key exchange (which
has been removed from go1.22).
The go TLS library always prefers ECDHE key exchange over RSA, so we
should only be negotiating RSA key exchange if the server we're reaching
out to doesn't support ECDHE at all.
Part of https://github.com/letsencrypt/boulder/issues/7321
Remove three deprecated feature flags which have been removed from all
production configs:
- StoreLintingCertificateInsteadOfPrecertificate
- LeaseCRLShards
- AllowUnrecognizedFeatures
Deprecate three flags which are set to true in all production configs:
- CAAAfterValidation
- AllowNoCommonName
- SHA256SubjectKeyIdentifier
IN-9879 tracked the removal of these flags.
This allows us to defer creating the user-friendly ProblemDetails to the
highest level (va.PerformValidation), which in turn makes it possible to
log the original error alongside the user-friendly error. It also
reduces the likelihood of "boxed nil" bugs.
Many of the unittests check for a specific ProblemDetails.Type and
specific Details contents. These test against the output of
`detailedError`, which transforms `error` into `ProblemDetails`. So the
updates to the tests include insertion of `detailedError(err)` in many
places.
Several places that were returning a specific ProblemDetails.Type
instead return the corresponding `berrors` type. This follows a pattern
that `berrors` was designed to enable: use the `berrors` types
internally and transform into `ProblemDetails` at the edge where we are
rendering something to present to the user: WFE, and now VA.
In the current code, when `processRemoteCAAResults` is called, its
`primaryResult` parameter is always set to `nil`. So we can simplify by
removing that parameter.
Previously, `va.IsCAAValid` would only check CAA records from the
primary VA during initial domain control validation, completely ignoring
any configured RVAs. The upcoming
[MPIC](https://github.com/ryancdickson/staging/pull/8) ballot will
require that it be done from multiple perspectives. With the currently
deployed [Multi-Perspective
Validation](https://letsencrypt.org/2020/02/19/multi-perspective-validation.html)
in staging and production, this change brings us in line with the
[proposed phase
3](https://github.com/ryancdickson/staging/pull/8/files#r1368708684).
This change reuses the existing
[MaxRemoteValidationFailures](21fc191273/cmd/boulder-va/main.go (L35))
variable for the required non-corroboration quorum.
> Phase 3: June 15, 2025 - December 14, 2025 ("CAs MUST implement MPIC
in blocking mode*"):
>
> MUST implement MPIC? Yes
> Required quorum?: Minimally, 2 remote perspectives must be used. If
using less than 6 remote perspectives, 1 non-corroboration is allowed.
If using 6 or more remote perspectives, 2 non-corroborations are
allowed.
> MUST block issuance if quorum is not met: Yes.
> Geographic diversity requirements?: Perspectives must be 500km from 1)
the primary perspective and 2) all other perspectives used in the
quorum.
>
> * Note: "Blocking Mode" is a nickname. As opposed to "monitoring mode"
(described in the last milestone), CAs MUST NOT issue a certificate if
quorum requirements are not met from this point forward.
Adds new VA feature flags:
* `EnforceMultiCAA` instructs a primary VA to command each of its
configured RVAs to perform a CAA recheck.
* `MultiCAAFullResults` causes the primary VA to block waiting for all
RVA CAA recheck results to arrive.
Renamed `va.logRemoteValidationDifferentials` to
`va.logRemoteDifferentials` because it can handle initial domain control
validations and CAA rechecking with minimal editing.
Part of https://github.com/letsencrypt/boulder/issues/7061
Replace the current three-piece setup (enum of feature variables, map of
feature vars to default values, and autogenerated bidirectional maps of
feature variables to and from strings) with a much simpler one-piece
setup: a single struct with one boolean-typed field per feature. This
preserves the overall structure of the package -- a single global
feature set protected by a mutex, and Set, Reset, and Enabled methods --
although the exact function signatures have all changed somewhat.
The executable config format remains the same, so no deployment changes
are necessary. This change does deprecate the AllowUnrecognizedFeatures
feature, as we cannot tell the json config parser to ignore unknown
field names, but that flag is set to False in all of our deployment
environments already.
Fixes https://github.com/letsencrypt/boulder/issues/6802
Fixes https://github.com/letsencrypt/boulder/issues/5229
Add a new feature flag "CAAAfterValidation" which, when set to true in
the VA, causes the VA to only begin CAA checks after basic domain
control validation has completed successfully. This will make successful
validations take longer, since the DCV and CAA checks are performed
serially instead of in parallel. However, it will also reduce the number
of CAA checks we perform by up to 80%, since such a high percentage of
validations also fail.
IN-9575 tracks enabling this feature flag in staging and prod
Fixes https://github.com/letsencrypt/boulder/issues/7058
Make minor, non-user-visible changes to how we structure the probs
package. Notably:
- Add new problem types for UnsupportedContact and
UnsupportedIdentifier, which are specified by RFC8555 and which we will
use in the future, but haven't been using historically.
- Sort the problem types and constructor functions to match the
(alphabetical) order given in RFC8555.
- Rename some of the constructor functions to better match their
underlying problem types (e.g. "TLSError" to just "TLS").
- Replace the redundant ProblemDetailsToStatusCode function with simply
always returning a 500 if we haven't properly set the problem's
HTTPStatus.
- Remove the ability to use either the V1 or V2 error namespace prefix;
always use the proper RFC namespace prefix.
Remove tracing using Beeline from Boulder. The only remnant left behind
is the deprecated configuration, to ensure deployability.
We had previously planned to swap in OpenTelemetry in a single PR, but
that adds significant churn in a single change, so we're doing this as
multiple steps that will each be significantly easier to reason about
and review.
Part of #6361
Remove the PortConfig field from both the VA's config struct and from
the NewValidationAuthorityImpl constructor. This config item is no
longer used anywhere, and removing this prevents us from accidentally
overriding the "Authorized Ports" (80 and 443) which are required by the
Baseline Requirements.
Unit tests are still able to override the httpPort and tlsPort fields of
the ValidationAuthorityImpl.
Fixes#3940
This avoids serialization errors passing through gRPC.
Also, add a pass-through path in replaceInvalidUTF8 that saves an
allocation in the trivial case.
Fixes#6490
Make minor changes to our implementation of CAA Account and Method
Binding, as a result of reviewing the code in preparation for enabling
it in production. Specifically:
- Ensure that the validation method and account ID are included at the
request level, rather than waiting until we perform the checks which use
those parameters;
- Clean up code which assumed the validation method and account ID might
not be populated;
- Use the core.AcmeChallenge type (rather than plain string) for the
validation method everywhere;
- Update comments to reference the latest version and correct sections
of the CAA RFCs; and
- Remove the CAA feature flags from the config integration tests to
reflect that they are not yet enabled in prod.
I have reviewed this code side-by-side with RFC 8659 (CAA) and RFC 8657
(ACME CAA Account and Method Binding) and believe it to be compliant
with both.
Enable the "unparam" linter, which checks for unused function
parameters, unused function return values, and parameters and
return values that always have the same value every time they
are used.
In addition, fix many instances where the unparam linter complains
about our existing codebase. Remove error return values from a
number of functions that never return an error, remove or use
context and test parameters that were previously unused, and
simplify a number of (mostly test-only) functions that always take the
same value for their parameter. Most notably, remove the ability to
customize the RSA Public Exponent from the ceremony tooling,
since it should always be 65537 anyway.
Fixes#6104
We have decided that we don't like the if err := call(); err != nil
syntax, because it creates confusing scopes, but we have not cleaned up
all existing instances of that syntax. However, we have now found a
case where that syntax enables a bug: It caused readers to believe that
a later err = call() statement was assigning to an already-declared err
in the local scope, when in fact it was assigning to an
already-declared err in the parent scope of a closure. This caused our
ineffassign and staticcheck linters to be unable to analyze the
lifetime of the err variable, and so they did not complain when we
never checked the actual value of that error.
This change standardizes on the two-line error checking syntax
everywhere, so that we can more easily ensure that our linters are
correctly analyzing all error assignments.
Add `stylecheck` to our list of lints, since it got separated out from
`staticcheck`. Fix the way we configure both to be clearer and not
rely on regexes.
Additionally fix a number of easy-to-change `staticcheck` and
`stylecheck` violations, allowing us to reduce our number of ignored
checks.
Part of #5681
Add Honeycomb tracing to all Boulder components which act as
HTTP servers, gRPC servers, or gRPC clients. Add many values
which we currently emit to logs to the trace spans. Add a way to
configure the Honeycomb integration to our config files, and by
default configure all of our tests to "mute" (send nothing).
Followup changes will refine the configuration, attempt to reduce
the new dependency load, and introduce better sampling.
Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218
protoc now generates grpc code in a separate file from protobuf code.
Also, grpc servers are now required to embed an "unimplemented"
interface from the generated .pb.go file, which provides forward
compatibility.
Update the generate.go files since the invocation for protoc has changed
with the split into .pb.org and _grpc.pb.go.
Fixes#5368
Delete the ValidationAuthorityGRPCServer and ...GRPCClient structs,
and update references to instead reference the underlying vapb.VAClient
type directly. Also delete the core.ValidationAuthority interface.
Does not require updating interfaces elsewhere, as the client
wrapper already included the variadic grpc.CallOption parameter.
Fixes#5325