We have upgraded to go1.24.1 in production, and no longer need to test
go1.23.x. Updating the version in our go.mod also allows us to begin
using x509.Certificate.Policies instead of .PolicyIdentifiers.
Add a new "MPICFullResults" feature flag. When this flag is enabled in
the VA, it will wait for all Remote VAs to return their results for both
Domain Control Validation and CAA checking, rather than short-circuiting
as soon as it has seen enough results to know whether corroboration will
or will not be achieved.
We make this change because waiting for these to return honestly doesn't
take that long, because we do validation (although not CAA rechecking)
asynchronously, and because it improves the quality of our MPIC quorum
summary logs (so we don't always say only 3/4 concurred because the
fourth was cancelled).
Fixes https://github.com/letsencrypt/boulder/issues/7809
When we receive an update-account request which is not empty, but
doesn't contain the "contact" field, don't assume that they want to
remove their contacts. Only remove contacts if the "contact" field is
present, but empty.
Add a unit test and an integration test which will catch regressions in
this behavior.
PR #8018 integrated the email-exporter service with WFE, updating
wfe.NewAccount and wfe.updateAccount to submit valid email contacts to
the Salesforce Pardot API. However, our new_or_updated_contact metric
shows that (account) contact updates currently exceed the highest
Salesforce tier’s daily submission limit by several times.
This change can be reverted if additional filtering logic reduces
updated (+ new) account contacts below the daily submission limit.
Remove crl-updater from the list of services run by startservers.py, so
that it isn't running at the same time as the crl-updater instances run
by specific integration tests. In return, add a new integration test
which starts crl-updater and waits for it to listen on its debug port,
just like startservers does.
Also make the existing crl-updater integration tests more robust and
more parallelizable by having them always reset the leasedUntil column
before executing the updater, instead of requiring each individual test
to perform that reset.
Fixes https://github.com/letsencrypt/boulder/issues/7590
Remove some RA tests that were checking for errors specific to the split
issuance flow. Make one of the tests test GetSCTs directly, which makes
for a much nicer test!
Add a new boulder service, email-exporter, which uses the Pardot API
client added in #8016 and the email.Exporter gRPC service added in
#8017.
Add pardot-test-srv, a test-only service for mocking communication with
Salesforce OAuth and Pardot APIs in non-production environments. Since
Salesforce does not provide Pardot functionality in developer sandboxes,
pardot-test-srv must run in all non-production environments (e.g.,
sre-development and staging).
Integrate the email-exporter service with the WFE and modify
WFE.NewAccount and WFE.UpdateAccount to submit valid email contacts.
Ensure integration tests verify that contacts eventually reach
pardot-test-srv.
Update configuration where necessary to:
- Build pardot-test-srv as a standalone binary.
- Bring up pardot-test-srv and cmd/email-exporter for integration
testing.
- Integrate WFE with cmd/email-exporter when running test/config-next.
Closes#7966
Compute the width of the ARI suggested renewal window as 2% of the
validity period. This means that 90-day certificates have their
suggested window shrink slightly from 48 hours to 43.2 hours, and gives
six-day (160h) certs a suggested window 3.2 hours wide.
Also move the center of that window to the midpoint of the certificate
validity period for certs which are valid for less than 10 days, so that
operators have (proportionally) a little more time to respond to renewal
issues.
Fixes https://github.com/letsencrypt/boulder/issues/7996
In a few places within the SA, we use explicit transactions to wrap
read-then-update style operations. Because we set the transaction
isolation level on a per-session basis, these transactions do not in
fact change their isolation level, and therefore generally remain at the
default isolation level of REPEATABLE READ.
Unfortunately, we cannot resolve this simply by converting the SELECT
statements into SELECT...FOR UPDATE statements: although this would fix
the issue by making those queries into locking statements, it also
triggers what appears to be an InnoDB bug when many transactions all
attempt to select-then-insert into a table with both a primary key and a
separate unique key, as the crlShards table has. This causes the
integration tests in GitHub Actions, which run with an empty database
and therefore use the needToInsert codepath instead of the update
codepath, to consistently flake.
Instead, resolve the issue by having the UPDATE statements specify that
the value of the leasedUntil column is still the same as was read by the
initial SELECT. Although two crl-updaters may still attempt these
transactions concurrently, the UPDATE statements will still be fully
sequenced, and the latter one will fail.
Part of https://github.com/letsencrypt/boulder/issues/8031
Replace DCV and CAA checks (PerformValidation and IsCAAValid) in
va/va.go and va/caa.go with their MPIC compliant counterparts (DoDCV and
DoCAA) in va/vampic.go. Deprecate EnforceMultiCAA and EnforceMPIC and
default code paths as though they are both true. Require that RIR and
Perspective be set for primary and remote VAs.
Fixes#7965Fixes#7819
Add MaxNames to the set of things that can be configured on a
per-profile basis. Remove all references to the RA's global maxNames,
replacing them with reference's to the current profile's maxNames. Add
code to the RA's main() to copy a globally-configured MaxNames into each
profile, for deployability.
Also remove any understanding of MaxNames from the WFE, as it is
redundant with the RA and is not configured in staging or prod. Instead,
hardcode the upper limit of 100 into the ratelimit package itself.
Fixes https://github.com/letsencrypt/boulder/issues/7993
Add a new RPC to the CA: `IssueCertificate` covers issuance of both the
precertificate and the final certificate. In between, it calls out to
the RA's new method `GetSCTs`.
The RA calls the new `CA.IssueCertificate` if the `UnsplitIssuance`
feature flag is true.
The RA had a metric that counted certificates by profile name and hash.
Since the RA doesn't receive a profile hash in the new flow, simply
record the total number of issuances.
Fixes https://github.com/letsencrypt/boulder/issues/7983
Deprecate the "InsertAuthzsIndividually" feature flag, which has been
set to true in both Staging and Production. Delete the code guarded
behind that flag being false, namely the ability of the MultiInserter to
return the newly-created IDs from all of the rows it has inserted. This
behavior is being removed because it is not supported in MySQL / Vitess.
Fixes https://github.com/letsencrypt/boulder/issues/7718
---
> [!WARNING]
> ~~Do not merge until IN-10737 is complete~~
Update from go1.23.1 to go1.23.6 for our primary CI and release builds.
This brings in a few security fixes that aren't directly relevant to us.
Add go1.24.0 to our matrix of CI and release versions, to prepare for
switching to this next major version in prod.
These paths receive (literally) zero traffic, and they require the WFE
to duplicate the RA's authorization lifetime configuration. Since that
configuration is now per-profile, the WFE can no longer easily replicate
it, and the resulting staleness calculations will be wrong. Remove the
duplicated configuration, remove the unused endpoints that rely on it,
and remove the staleness-checking code which supported those endpoints.
Leave the non-ACME /get/ endpoint for certificates in place, because
checking staleness for those does not require any additional
configuration, and having a non-ACME serial-based API for certificates
is a good thing.
Fixes https://github.com/letsencrypt/boulder/issues/8007
The crl-storer passes along Cache-Control and Expires from the
crl-updater (because the crl-updater knows the UpdatePeriod).
The crl-updater calculates the Expires header based on when it expects
to update the CRL, plus a margin of error.
Fixes#8004
Increase pkilint timeout from 200ms to 2s. In #8006 I found that errors
were stemming from timeouts talking to the bpkilint container. These
probably showed up in TestRevocation particularly because that
integration test now issues for many certificates in parallel. Pkilint's
slowness, combined with the relatively small number of cores in CI,
probably resulted in some requests taking too long.
Remove the deprecated WFE & nonce config field `NoncePrefixKey`, which
has been replaced by `NonceHMACKey`.
<del>DO NOT MERGE until:</del>
- <del>#7793 (in `release-2024-11-18`) has been deployed, AND:</del>
- <del>`NoncePrefixKey` has been removed from all running configs.</del>
Fixes#7632
When we turn on explicit sharding, we'll change the CA serial prefix, so
we can know that all issuance from the new prefixes uses explicit
sharding, and all issuance from the old prefixes uses temporal sharding.
This lets us avoid putting a revoked cert in two different CRL shards
(the temporal one and the explicit one).
To achieve this, the crl-updater gets a list of temporally sharded
serial prefixes. When it queries the `certificateStatus` table by date
(`GetRevokedCerts`), it will filter out explicitly sharded certificates:
those that don't have their prefix on the list.
Part of #7094
Add three new fields to the ra.ValidationProfile structure, representing
the profile's pending authorization lifetime (used to assign an
expiration when a new authz is created), valid authorization lifetime
(used to assign an expiration when an authz is successfully validated),
and order lifetime (used to assign an expiration when a new order is
created). Remove the prior top-level fields which controlled these
values across all orders.
Add a "defaultProfileName" field to the RA as well, to facilitate
looking up a default set of lifetimes when the order doesn't specify a
profile. If this default name is explicitly configured, always provide
it to the CA when requesting issuance, so we don't have to duplicate the
default between the two services.
Modify the RA's config struct in a corresponding way: add three new
fields to the ValidationProfiles structure, and deprecate the three old
top-level fields. Also upgrade the ra.NewValidationProfile constructor
to handle these new fields, including doing validation on their values.
Fixes https://github.com/letsencrypt/boulder/issues/7605
The CRLDP is included only when the profile's
IncludeCRLDistributionPoints field is true.
Introduce a new config field for issuers, CRLShards. If
IncludeCRLDistributionPoints is true and this is zero, issuance will
error.
The CRL shard is assigned at issuance time based on the (random) low
bits of the serial number.
Part of https://github.com/letsencrypt/boulder/issues/7094
The current profiles draft
(https://datatracker.ietf.org/doc/draft-aaron-acme-profiles/00/) says:
> If a server receives a request to finalize an Order whose profile the
> CA is no longer willing to issue under, it MUST respond with a
> problem document of type "invalidProfile". The server SHOULD attempt
> to avoid this situation, e.g. by ensuring that all Orders for a
> profile have expired before it stops issuing under that profile.
Add types and helper functions representing this new error type to the
berrors, probs, and web packages. Update the WFE code which rejects
new-order requests with unrecognized profiles to use these new types,
and add similar code to the WFE's finalize path. Update the unit and
integration tests to reflect the fact that we now configure at least one
profile in both Staging and Prod (tracked in IN-10574).
We've been seeing some flaky integration tests where issuance fails. The
integration test only has access to the generic user-facing error. The
real error is available as `InternalError` in the WFE logs, but we need
a higher log level to see it.
In revocation_test.go, fetch all CRLs, and look for revoked certificates
on both CRLs and OCSP.
Make s3-test-srv listen on all interfaces, so the CRL URLs in the CA
config work.
Add IssuerNameIDs to the CRL URLs in ca.json, to match how those CRLs
are uploaded to S3.
Make TestRevocation parallel. Speedup from ~60s to ~3s.
Increase ocsp-responder's allowed parallelism to account for parallel
test. Also, add "maxInflightSignings" to config/ since it's in prod.
"maxSigningWaiters" is not yet in prod, so don't move that field.
Add a mutex around running crl-updater, and decrease the log level so
errors stand out more when they happen.
Remove the singular Profile field from the CA config, as it has been
replaced by the plural CertProfiles key. Remove the Expiry, Backdate,
LintConfig, and IgnoredLints keys from the top-level CA config, as they
are now also configured on a per-profile basis. Remove the LifespanCRL
key from the CA config, as it is now configured within the CRLProfile.
For all of the above, remove transitional fallbacks from within
//ca/main.go.
These config changes were deployed to production in IN-10568, IN-10506,
and IN-10045.
Fixes https://github.com/letsencrypt/boulder/issues/7414
Fixes https://github.com/letsencrypt/boulder/issues/7159
The admin-revoker tool is dead. Long live the admin tool.
There's a number places that still reference admin-revoker, including
Boulder's ipki and the revocation source in the database which are still
used, even if the tool is gone. But nothing actually using the tool.
The schema tool used to parse log_list_schema.json doesn't work well
with the updated schema. This is going to be required to support
static-ct-api logs from current Chrome log lists.
Instead, use the loglist3 package inside the certificate-transparency-go
project, which Boulder already uses for CT submission otherwise.
As well, the Log IDs and keys returned from loglist3 have already been
base64 decoded, so this re-encodes them to minimize the impact on the
rest of the codebase and keep this change small.
The test log_list.json file needed to be made a bit more realistic for
loglist3 to parse without base64 or date parsing errors.
Remove code using `certificatesPerName` & `newOrdersRL` tables.
Deprecate `DisableLegacyLimitWrites` & `UseKvLimitsForNewOrder` flags.
Remove legacy `ratelimit` package.
Delete these RA test cases:
- `TestAuthzFailedRateLimitingNewOrder` (rl:
`FailedAuthorizationsPerDomainPerAccount`)
- `TestCheckCertificatesPerNameLimit` (rl: `CertificatesPerDomain`)
- `TestCheckExactCertificateLimit` (rl: `CertificatesPerFQDNSet`)
- `TestExactPublicSuffixCertLimit` (rl: `CertificatesPerDomain`)
Rate limits in NewOrder are now enforced by the WFE, starting here:
5a9b4c4b18/wfe2/wfe.go (L781)
We collect a batch of transactions to check limits, check them all at
once, go through and find which one(s) failed, and serve the failure
with the Retry-After that's furthest in the future. All this code
doesn't really need to be tested again; what needs to be tested is that
we're returning the correct failure. That code is
`NewOrderLimitTransactions`, and the `ratelimits` package's tests cover
this.
The public suffix handling behavior is tested by
`TestFQDNsToETLDsPlusOne`:
5a9b4c4b18/ratelimits/utilities_test.go (L9)
Some other RA rate limit tests were deleted earlier, in #7869.
Part of #7671.
And in the RA, log the notBefore of the previous issuance.
To make this happen, I had to hoist the "check for previous certificate"
up a level into `issueCertificateOuter`. That meant I also had to hoist
the "split off a WithoutCancel context" logic all the way up to
`FinalizeOrder`.
This brings in several new and useful lints. It also brings in one CABF
BR lint which we have to ignore in our default profile which includes
the Subject Key Identifier extension:
"w_ext_subject_key_identifier_not_recommended_subscriber". In our modern
profile which omits several fields, we have to ignore the opposite
RFC5280 lint "w_ext_subject_key_identifier_missing_sub_cert".
Release notes: https://github.com/zmap/zlint/releases/tag/v3.6.4
Changelog: https://github.com/zmap/zlint/compare/v3.6.0...v3.6.4
Note that the majority of the ~400 file changes are merely copyright
date changes.
The corresponding production config changes tracked in IN-10466 are
complete.
The RA's DeactivateAccount method expects the account provided to it by
the WFE to still have status Valid. The new WFE deactivation code was
hardcoding the status to Deactivated. Fix the WFE to pass the account's
current status instead.
Add an integration test to confirm both the breakage and the fix. Also
leave behind some TODOs to simplify this codepath further, and not
require the status to be provided at all.
Part of #5554
webpki.go was discarding stdout when "make build" failed. We can make it
print stdout in that context, but it's more straightforward to run "make
build" from the shell script that calls webpki.go, where its stdout will
naturally be emitted.
Inspired by a recent CI run where there was a straightforward build
failure in some of Boulder's code, but it was masked by an error running
webpki.go in the `bsetup` container.
Remove `debugAddr` from the `admin` tool, which doesn't use it - or need
it, now that `newStatsRegistry` via `StatsAndLogging` doesn't require
it.
Remove `debugAddr` from `config-next/sfe.json`, as we usually set it on
the CLI instead.
Fixes#7838
Today, we have VA.PerformValidation, a method called by the RA at
challenge time to perform DCV and check CAA. We also have VA.IsCAAValid,
a method invoked by the RA at finalize time when a CAA re-check is
necessary. Both of these methods can be executed on remote VA
perspectives by calling the generic VA.performRemoteValidation.
This change splits VA.PerformValidation into VA.DoDCV and VA.DoCAA,
which are both called on remote VA perspectives by calling the generic
VA.doRemoteOperation. VA.DoDCV, VA.DoCAA, and VA.doRemoteOperation
fulfill the requirements of SC-067 V3: Require Multi-Perspective
Issuance Corroboration by:
- Requiring at least three distinct perspectives, as outlined in the
"Phased Implementation Timeline" in BRs section 3.2.2.9 ("Effective
March 15, 2025").
- Ensuring that the number of non-corroborating (failing) perspectives
remains below the threshold defined by the "Table: Quorum Requirements"
in BRs section 3.2.2.9.
- Ensuring that corroborating (passing) perspectives reside in at least
2 distinct Regional Internet Registries (RIRs) per the "Phased
Implementation Timeline" in BRs section 3.2.2.9 ("Effective March 15,
2026").
- Including an MPIC summary consisting of: passing perspectives, failing
perspectives, passing RIRs, and a quorum met for issuance (e.g., 2/3 or
3/3) in each validation audit log event, per BRs Section 5.4.1,
Requirement 2.8.
When the new SeparateDCVAndCAAChecks feature flag is enabled on the RA,
calls to VA.IsCAAValid (during finalization) and VA.PerformValidation
(during challenge) are replaced with calls to VA.DoCAA and a sequence of
VA.DoDCV followed by VA.DoCAA, respectively.
Fixes#7612Fixes#7614Fixes#7615Fixes#7616
- Ensure the Perspective and RIR reported by each remoteVA in the
*vapb.ValidationResult returned by VA.PerformValidation, matches the
expected local configuration when that configuration is present.
- Correct "AfriNIC" to "AFRINIC", everywhere.
Part of https://github.com/letsencrypt/boulder/issues/7819
Pending authz reuse is a nice-to-have feature because it allows us to
create fewer rows in the authz database table when creating new orders.
However, stats show that less than 2% of authorizations that we attach
to new orders are reused pending authzs. And as we move towards using a
more streamlined database schema to store our orders, authorizations,
and validation attempts, disabling pending authz reuse will greatly
simplify our database schema and code.
CPS Compliance Review: our CPS does not speak to whether or not we reuse
pending authorizations for new orders.
IN-10859 tracks enabling this flag in prod
Part of https://github.com/letsencrypt/boulder/issues/7715
Greatly simplify the two RA unit tests covering failed validations and
account+identifier pausing. Most importantly, directly manipulate the
ratelimit backing store during test setup, to avoid having to "perform"
extra validations.
Fixes https://github.com/letsencrypt/boulder/issues/7812
- Remove undeployed feature flag MultiCAAFullResults
- Perform local CAA checks prior to initiating remote checks, instead of
starting remote checks and proceeding to perform local checks.
- Remove VA.IsCAAValid specific remote validation logic, use
VA.performRemoteOperation instead
- Refactor va.logRemoteResults to be easier to test and omit the RVA
problem
- Drive-by fix: Calculate logEvent.Latency with va.clk.Since() instead
of time.Since() like everything else in VA.performRemoteOperation
- Make performRemoteValidation a more generic function that returns a
new remoteResult interface
- Modify the return value of IsCAAValid and PerformValidation to satisfy
the remoteResult interface
- Include compile time checks and tests that pass an arbitrary operation
- Make the primary VA aware of the expected Perspective and RIR of each
remote VA.
- All Perspectives should be unique, have the primary VA check for
duplicate Perspectives at startup.
- Update test setup functions to ensure that each remote VA client and
corresponding inmem impl have a matching perspective and RIR.
Part of #7819
- Remove Perspective and RIR from ValidationRecords
- Make ValidationResultToPB Perspective and RIR aware
- Update comment for VA.PerformValidation
- Make verificationRequestEvent more like doDCVAuditLog
- Update language used in problems created by performRemoteValidation to
be more like doRemoteDCV.
To prepare for the MPIC requirement of having a minimum of 3
perspectives, I added code to `NewValidationAuthorityImpl` to error if
there aren't enough remote VAs configured _and_ the current VA is the
primary perspective. Then I fixed all the tests, which involved adding
some backends in the unittests, and spinning up `remoteva-c` in the
integration tests.
As a reminder, the `boulder va` command always considers itself the
primary perspective, while `boulder remoteva` gives itself a perspective
based on its config.
I wound up backing out the code in `NewValidationAuthorityImpl` because
right now our remote VAs are actually running the `boulder va` command,
so they would error out in prod, even though our actual primary
perspective does have enough backends. So this wound up as a test-only
change.
Previously this was a configuration field.
Ports `maxAllowedFailures()` from `determineMaxAllowedFailures()` in
#7794.
Test updates:
Remove the `maxRemoteFailures` param from `setup` in all VA tests.
Some tests were depending on setting this param directly to provoke
failures.
For example, `TestMultiVAEarlyReturn` previously relied on "zero allowed
failures". Since the number of allowed failures is now 1 for the number
of remote VAs we were testing (2), the VA wasn't returning early with an
error; it was succeeding! To fix that, make sure there are two failures.
Since two failures from two RVAs wouldn't exercise the right situation,
add a third RVA, so we get two failures from three RVAs.
Similarly, TestMultiCAARechecking had several test cases that omitted
this field, effectively setting it to zero allowed failures. I updated
the "1 RVA failure" test case to expect overall success and added a "2
RVA failures" test case to expect overall failure (we previously
expected overall failure from a single RVA failing).
In TestMultiVA I had to change a test for `len(lines) != 1` to
`len(lines) == 0`, because with more backends we were now logging more
errors, and finding e.g. `len(lines)` to be 2.
This means that most traffic will go to the authz URLs with account.
After this has been deployed for 30 days (the max lifetime of an order),
we can remove support for the old paths.
Part of #7683
Add a new WFE & nonce config field, `NonceHMACKey`, which uses the new
`cmd.HMACKeyConfig` type. Deprecate the `NoncePrefixKey` config field.
Generalize the error message when validating `HMACKeyConfig` in
`config`.
Remove the deprecated `UseDerivablePrefix` config field, which is no
longer used anywhere.
Part of #7632
- Added a new key-value ratelimit
`FailedAuthorizationsForPausingPerDomainPerAccount` which is incremented
each time a client fails a validation.
- As long as capacity exists in the bucket, a successful validation
attempt will reset the bucket back to full capacity.
- Upon exhausting bucket capacity, the RA will send a gRPC to the SA to
pause the `account:identifier`. Further validation attempts will be
rejected by the [WFE](https://github.com/letsencrypt/boulder/pull/7599).
- Added a new feature flag, `AutomaticallyPauseZombieClients`, which
enables automatic pausing of zombie clients in the RA.
- Added a new RA metric `paused_pairs{"paused":[bool],
"repaused":[bool], "grace":[bool]}` to monitor use of this new
functionality.
- Updated `ra_test.go` `initAuthorities` to allow accessing the
`*ratelimits.RedisSource` for checking that the new ratelimit functions
as intended.
Co-authored-by: @pgporada
Fixes https://github.com/letsencrypt/boulder/issues/7738
---------
Co-authored-by: Phil Porada <pporada@letsencrypt.org>
Co-authored-by: Phil Porada <philporada@gmail.com>
Goodkey has two ways to detect a key as weak: it runs a variety of
algorithmic checks (such as Fermat factorization and rocacheck), or the
key can be listed in a "weak key file". Similarly, it has two ways to
detect a key as blocked: it can call a generic function (which we use to
query our database), or the key can be listed in a "blocked key file".
This is two methods too many. Reliance on files of weak or blocked keys
introduces unnecessary complexity to both the implementation and
configuration of the goodkey package. Remove both "key file" options and
delete all code which supported them.
Also remove //test/block-a-key, as it was only used to generate these
test files.
IN-10762 tracked the removal of these files in prod.
Fixes https://github.com/letsencrypt/boulder/issues/7748
TestTraces is designed to test whether our Open Telemetry tracing system
is working: that spans are being output, that they have the appropriate
parents, etc. It should not be testing whether Boulder took a specific
path through its code -- that's the domain of package-specific unit
tests. Simplify TestTraces to the point that it is asserting (nearly)
the bare minimum about the set of operations Boulder performs.
Add a new method, `BatchIncrement`, to issue `IncrBy` (instead of `Set`)
to Redis. This helps prevent the race condition that allows bursts of
near-simultaneous requests to, effectively, spend the same token.
Call this new method when incrementing an existing key. New keys still
need to use `BatchSet` because Redis doesn't have a facility to, within
a single operation, increment _or_ set a default value if none exists.
Add a new feature flag, `IncrementRateLimits`, gating the use of this
new method.
CPS Compliance Review: This feature flag does not change any behaviour
that is described or constrained by our CP/CPS. The closest relation
would just be API availability in general.
Fixes#7780
- Add `Perspective` and `RIR` fields to the remote-va configuration
- Configure RVA ValidationAuthorityImpl instances with the contents of
the JSON configuration
- Configure VA ValidationAuthorityImpl instances with the constant
`va.PrimaryPerspective`
- Log `Perspective` for non-Primary Perspectives, per the MPIC
requirements in section 5.4.1 (2) vii of the BRs. Also log the RIR for
posterity.
- Introduce `ValidationResult` RPC fields `Perspective` and `Rir`, which
are not currently used but will be required for corroboration in #7616
Fixes https://github.com/letsencrypt/boulder/issues/7613
Part of https://github.com/letsencrypt/boulder/issues/7615
Part of https://github.com/letsencrypt/boulder/issues/7616
The CAB/F Debian weak keys (https://github.com/cabforum/Debian-weak-keys)
repository contains a bunch of DER encoded private keys that we should ensure
are blocked. I hacked up the block-a-key tool to output a base64 encoded SPKI
hash from an arbitrary PEM formatted private key file.
This is our only use of MariaDB's "INSERT ... RETURNING" syntax, which
does not exist in MySQL and Vitess. Add a feature flag which removes our
use of this feature, so that we can easily disable it and then re-enable
it if it turns out to be too much of a performance hit.
Also add a benchmark showing that the serial-insertion approach is
slower, but perhaps not debilitatingly so.
Part of https://github.com/letsencrypt/boulder/issues/7718
Add a new SerialPrefixHex field to the CA's config, which takes a
two-character hexadecimal string to use as the serial prefix. This
matches the way that the OCSP Responder's acceptable serial prefixes are
configured, and is easier for human operators to configure than raw
integers.
At the same time, change the type of the CA's internal serial prefix
from `int` to `byte`, using the type system to enforce its 8-bit length.
Fixes#7213
Clean up how we handle identifiers throughout the Boulder codebase by
- moving the Identifier protobuf message definition from sa.proto to
core.proto;
- adding support for IP identifier to the "identifier" package;
- renaming the "identifier" package's exported names to be clearer; and
- ensuring we use the identifier package's helper functions everywhere
we can.
This will make future work to actually respect identifier types (such as
in Authorization and Order protobuf messages) simpler and easier to
review.
Part of https://github.com/letsencrypt/boulder/issues/7311
Begin testing on go1.23. To facilitate this, also update /x/net,
golangci-lint, staticcheck, and pebble-challtestsrv to versions which
support go1.23. As a result of these updates, also fix a handful of new
lint findings, mostly regarding passing non-static (i.e. potentially
user-controlled) format strings into Sprintf-style functions.
Additionally, delete one VA unittest that was duplicating the checks
performed by a different VA unittest, but with a context timeout bug
that caused it to break when go1.23 subtly changed DialContext behavior.
Have the WFE ask the RA for authorizations, rather than asking the SA
directly. This extra layer of indirection allows us to filter out
challenges which have been disabled, so that clients don't think they
can attempt challenges that we have disabled.
Also shuffle the order of challenges within the authz objects rendered
by the API. We used to have code which does this at authz creation time,
but of course that was completely ineffectual once we stored the
challenges as just a bitmap in the database.
Update the WFE unit tests to mock RA.GetAuthorization instead of
SA.GetAuthorization2. This includes making the mock more accurate, so
that (e.g.) valid authorizations contain valid challenges, and the
challenges have their correct types (e.g. "http-01" instead of just
"http"). Also update the OTel tracing test to account for the new RPC.
Part of https://github.com/letsencrypt/boulder/issues/5913
- Add feature flag `UseKvLimitsForNewOrder`
- Add feature flag `UseKvLimitsForNewAccount`
- Flush all Redis shards before running integration or unit tests, this
avoids false positives between local testing runs
Fixes#7664
Blocked by #7676
- Check `CertificatesPerDomain` at newOrder and spend at Finalize time.
- Check `CertificatesPerAccountPerDomain` at newOrder and spend at
Finalize time.
- Check `CertificatesPerFQDNSet` at newOrder and spend at Finalize time.
- Fix a bug
in`FailedAuthorizationsPerDomainPerAccountSpendOnlyTransaction()` which
results in failed authorizations being spent for the exact FQDN, not the
eTLD+1.
- Remove redundant "max names" check at transaction construction time
- Enable key-value rate limits in the RA
- Instruct callers to call *Decision.Result() to check the result of
rate limit transactions
- Preserve the Transaction within the resulting *Decision
- Generate consistently formatted verbose errors using the metadata
found in the *Decision
- Fix broken key-value rate limits integration test in
TestDuplicateFQDNRateLimit
Fixes#7577
Replace all of Boulder's usage of the Go stdlib "math/rand" package with
the newer "math/rand/v2" package which first became available in go1.22.
This package has an improved API and faster performance across the
board.
See https://go.dev/blog/randv2 and https://go.dev/blog/chacha8rand for
details.
Move the two lint-configuration keys, LintConfig and IgnoreLints, from
the top-level CA.Issuance config stanza into each individual
CA.Issuance.CertProfiles stanza. This allows us to have
differently-configured lints for different profiles, to ensure that our
linting regime is as strict as possible.
Without this change, it would be necessary for us to ignore both the
"common name included" and the "no subject key id" lints at the
top-level, when in fact each of those warnings only triggers on one of
our two profiles.
Fixes https://github.com/letsencrypt/boulder/issues/7635
Call `RA.UnpauseAccount` for valid unpause form submissions.
Determine and display the appropriate outcome to the Subscriber based on
the count returned by `RA.UnpauseAccount`:
- If the count is zero, display the "Account already unpaused" message.
- If the count equals the max number of identifiers allowed in a single
request, display a page explaining the need to visit the unpause URL
again.
- Otherwise, display the "Successfully unpaused all N identifiers"
message.
Apply per-request timeout from the SFE configuration.
Part of https://github.com/letsencrypt/boulder/issues/7406
Add three new keys to the CA's ProfileConfig:
- OmitKeyEncipherment causes the keyEncipherment Key Usage to be omitted
from certificates with RSA public keys. We currently include it for
backwards compatibility with TLS 1.1 servers that don't support modern
cipher suites, but this KU is completely useless as of TLS 1.3.
- OmitClientAuth causes the tlsClientAuthentication Extended Key Usage
to be omitted from all certificates. We currently include it to support
any subscribers who may be relying on it, but Root Programs are moving
towards single-purpose hierarchies and its inclusion is being
discouraged.
- OmitSKID causes the Subject Key Identifier extension to be omitted
from all certificates. We currently include this extension because it is
recommended by RFC 5280, but it serves little to no practical purpose
and consumes a large number of bytes, so it is now NOT RECOMMENDED by
the Baseline Requirements.
Make substantive changes to issuer.requestValid and issuer.Prepare to
implement the desired behavior for each of these options. Make a very
slight change to ra.matchesCSR to generally allow for serverAuth-only
EKUs. Improve the unit tests of both the //ca and //issuance packages to
cover the new behavior.
Part of https://github.com/letsencrypt/boulder/issues/7610
One of our goals with profiles is to allow different profiles to have
different validity periods. While the profiles already had the ability
to enforce different maximum backdates and validities, the CA still had
separate global configuration for what the backdate and validity period
should actually be.
Move the computation of the notBefore and notAfter timestamps into the
issuance package, so that it can be based on the profile's configured
backdate and validity durations. Deprecate the global "backdate" and
"expiry" config fields, as they are no longer used. Finally, add more
validation for the profile's backdate and validity.
Part of https://github.com/letsencrypt/boulder/issues/7610
Add a new profile config key named "OmitCommonName" which, if set to
`true`, causes the issuance package to exclude the CN from the resulting
certificate even if the initiating IssuanceRequest specified one.
Deprecate the old "AllowCommonName" config key, so that it no longer has
any effect, rather than causing the issuance package to fully reject
IssuanceRequests containing a CN.
This allows for more graceful variation between profiles, since we know
that excluding the Common Name is always safe.
Part of https://github.com/letsencrypt/boulder/issues/7610
Change the way profiles are configured at the WFE to allow them to be
accompanied by descriptive strings. Augment the construction of the
directory resource's "meta" sub-object to include these profile names
and descriptions.
This config swap is safe, since no Boulder WFE instance is configured
with `CertificateProfileNames` yet.
Fixes https://github.com/letsencrypt/boulder/issues/7602
These profile variables are set to "true" everywhere, and we have no
intention of ever setting them to "false" anywhere. Deprecate them so
that they can be removed in the future, and to reduce the chances of
confusion when new profile variables are introduced in the near future.
Part of https://github.com/letsencrypt/boulder/issues/7610
This change guarantees compliance with CA/BF Ballot SC-073 "Compromised
and Weak Keys", which requires that at least 100 rounds of Fermat
Factorization be attempted:
> Section 6.1.1.3 Subscriber Key Pair Generation
> The CA SHALL reject a certificate request if... The Public Key
corresponds to an industry-demonstrated weak Private Key. For requests
submitted on or after November 15, 2024,... In the case of Close Primes
vulnerability (https://fermatattack.secvuln.info/), the CA SHALL reject
weak keys which can be factored within 100 rounds using Fermat’s
factorization method.
We choose 110 rounds to ensure a margin above and beyond the requirements.
Fixes https://github.com/letsencrypt/boulder/issues/7558