We were adding the lookback period to `clk.Now()` but should have been
subtracting it. Includes a unittest, which I've verified fails against
the pre-fix code.
The conditional introduced in
https://github.com/letsencrypt/boulder/pull/8067 contained a bug left
over from an earlier draft of the PR. Remove the zero-length check to
ensure the code matches the documented intent.
Add a new custom lint which sends CRLs to PKIMetal, and configure it to
run in our integration test environment. Factor out most of the code
used to talk to the PKIMetal API so that it can be shared by the two
custom lints which do so. Add the ability to configure lints to the
CRLProfileConfig, so that zlint knows where to load the necessary custom
config from.
Give boulder-observer the ability to detect if the CRL it fetches is the
CRL it expects, by comparing that CRLs issuingDistributionPoint
extension to the prober's configured URL. Only do this if instructed to
(by configuring the CRL prober as "partitioned") because non-partitioned
CRLs do not necessarily contain an IDP.
Fixes https://github.com/letsencrypt/boulder/issues/7527
Update the SA to re-query the database for the updated account after
deactivating it, and return this to the RA. Update the RA to pass this
value through to the WFE. Update the WFE to return this value, rather
than locally modifying the pre-deactivation account object, if it gets
one (for deployability).
Also remove the RA's requirement that the request object specify its
current status so that the request can be trimmed down to just an ID.
This proto change is backwards-compatible because the new
DeactivateRegistrationRequest's registrationID field has the same type
(int64) and field number (1) as corepb.Registration's id field.
Part of https://github.com/letsencrypt/boulder/issues/5554
Give cert-checker the ability to load zlint configs, so that it can be
configured to talk to PKIMetal in CI and hopefully in staging/production
in the future.
Also update how cert-checker executes lints, so that it uses a real lint
registry instead of using the global registry and passing around a
dictionary of lints to filter out of the results.
Fixes https://github.com/letsencrypt/boulder/issues/7786
Replace the bpkilint container with a new bpkimetal container. Update
our custom lint which calls out to that API to speak PKIMetal's (very
similar) protocol instead. Update our zlint custom configuration to
configure this updated lint.
Fixes https://github.com/letsencrypt/boulder/issues/8009
- Plumb the "replaces" value from the WFE through to the SA via the RA
- Store validated "replaces" value for new orders in the orders table
- Reflect the stored "replaces" value to subscribers in the order object
- Reorder CertificateProfileName before Replaces/ReplacesSerial in RA
and SA protos for consistency
Fixes#8034
Return "alreadyReplaced" in addition to HTTP 409 Conflict to signal that
an order indicates that it replaces a certificate which already has a
replacement order.
We have upgraded to go1.24.1 in production, and no longer need to test
go1.23.x. Updating the version in our go.mod also allows us to begin
using x509.Certificate.Policies instead of .PolicyIdentifiers.
Add a new "MPICFullResults" feature flag. When this flag is enabled in
the VA, it will wait for all Remote VAs to return their results for both
Domain Control Validation and CAA checking, rather than short-circuiting
as soon as it has seen enough results to know whether corroboration will
or will not be achieved.
We make this change because waiting for these to return honestly doesn't
take that long, because we do validation (although not CAA rechecking)
asynchronously, and because it improves the quality of our MPIC quorum
summary logs (so we don't always say only 3/4 concurred because the
fourth was cancelled).
Fixes https://github.com/letsencrypt/boulder/issues/7809
Populate the new x509.Certificate.Policies field everywhere we currently populate the x509.Certificate.PolicyIdentifiers field. This allows Go to use whichever field it prefers (go1.23 prefers PolicyIdentifiers, go1.24 prefers Policies) as the source of truth when serializing a certificate.
Part of https://github.com/letsencrypt/boulder/issues/7148
When we receive an update-account request which is not empty, but
doesn't contain the "contact" field, don't assume that they want to
remove their contacts. Only remove contacts if the "contact" field is
present, but empty.
Add a unit test and an integration test which will catch regressions in
this behavior.
Previously this test was passing not because the VA was returning early,
but because the fake HTTP server was only sleeping for 1000 nanoseconds
instead of 1000 milliseconds. The test cases were not exercising the
VA's early-return codepath, because they do not include sufficiently
high ratios of passing or failing remotes to hit quorum early.
Fix the sleep time so the fake HTTP server works as expected, and reduce
the (desired) sleep time from 1000ms to 100ms because that's more than
sufficient for the behavior we're testing.
Fix and diversify the test cases to actually hit positive or negative
quorum, so that the VA's early-return codepath is actually exercised.
This PR will be followed by a non-test PR which removes this
early-return codepath and modifies this test further, but I thought it
was important to have this test in fully working order before modifying
the code it tests.
Part of https://github.com/letsencrypt/boulder/issues/7809
PR #8018 integrated the email-exporter service with WFE, updating
wfe.NewAccount and wfe.updateAccount to submit valid email contacts to
the Salesforce Pardot API. However, our new_or_updated_contact metric
shows that (account) contact updates currently exceed the highest
Salesforce tier’s daily submission limit by several times.
This change can be reverted if additional filtering logic reduces
updated (+ new) account contacts below the daily submission limit.
Remove crl-updater from the list of services run by startservers.py, so
that it isn't running at the same time as the crl-updater instances run
by specific integration tests. In return, add a new integration test
which starts crl-updater and waits for it to listen on its debug port,
just like startservers does.
Also make the existing crl-updater integration tests more robust and
more parallelizable by having them always reset the leasedUntil column
before executing the updater, instead of requiring each individual test
to perform that reset.
Fixes https://github.com/letsencrypt/boulder/issues/7590
Remove some RA tests that were checking for errors specific to the split
issuance flow. Make one of the tests test GetSCTs directly, which makes
for a much nicer test!
Add a new boulder service, email-exporter, which uses the Pardot API
client added in #8016 and the email.Exporter gRPC service added in
#8017.
Add pardot-test-srv, a test-only service for mocking communication with
Salesforce OAuth and Pardot APIs in non-production environments. Since
Salesforce does not provide Pardot functionality in developer sandboxes,
pardot-test-srv must run in all non-production environments (e.g.,
sre-development and staging).
Integrate the email-exporter service with the WFE and modify
WFE.NewAccount and WFE.UpdateAccount to submit valid email contacts.
Ensure integration tests verify that contacts eventually reach
pardot-test-srv.
Update configuration where necessary to:
- Build pardot-test-srv as a standalone binary.
- Bring up pardot-test-srv and cmd/email-exporter for integration
testing.
- Integrate WFE with cmd/email-exporter when running test/config-next.
Closes#7966
Add an `identifier` field to the `va.PerformValidationRequest` proto, which will soon replace its `dnsName` field.
Accept and prefer the `identifier` field in every VA function that uses this struct. Don't (yet) assume it will be present.
Throughout the VA, accept and handle the IP address identifier type. Handling is similar to DNS names, except that `getAddrs` is not called, and consider that:
- IPs are represented in a different field in the `x509.Certificate` struct.
- IPs must be presented as reverse DNS (`.arpa`) names in SNI for [TLS-ALPN-01 challenge requests](https://datatracker.ietf.org/doc/html/rfc8738#name-tls-with-application-layer-).
- IPv6 addresses are enclosed in square brackets when composing or parsing URLs.
For HTTP-01 challenges, accept redirects to bare IP addresses, which were previously rejected.
Fixes#2706
Part of #7311
When checking CAA, issuance is allowed if the relevant RRSet (as defined
in RFC 8659, Section 3) does not contain any records of the right
Property kind (issue or issuewild) for the kind of checking being
attempted. Previously, we correctly detected that a non-wildcard
issuance attempt could short-circuit our validation logic if no issue
records are present. However, we did not do a similar short-circuit for
wildcard issuance attempts when no issue records and no issuewild
records are present.
Add a test which demonstrates that a nearly-empty RRSet accidentally
forbade issuance of wildcard certs. Update our logic to perform the "no
relevant records" check slightly later, so that it catches both the
wildcard and non-wildcard cases, causing the new test to pass.
Fixes https://github.com/letsencrypt/boulder/issues/8032
Compute the width of the ARI suggested renewal window as 2% of the
validity period. This means that 90-day certificates have their
suggested window shrink slightly from 48 hours to 43.2 hours, and gives
six-day (160h) certs a suggested window 3.2 hours wide.
Also move the center of that window to the midpoint of the certificate
validity period for certs which are valid for less than 10 days, so that
operators have (proportionally) a little more time to respond to renewal
issues.
Fixes https://github.com/letsencrypt/boulder/issues/7996
Fix a remaining edge case after #7468: one call to `newIPError` did not
account for when we retry *successfully,* but then are served a redirect
which errors. In those cases, our `client.Do` call results in our
redirect handler `processRedirect` appending yet another validation
record to `records`, which was missed.
Fixes#7347
In a few places within the SA, we use explicit transactions to wrap
read-then-update style operations. Because we set the transaction
isolation level on a per-session basis, these transactions do not in
fact change their isolation level, and therefore generally remain at the
default isolation level of REPEATABLE READ.
Unfortunately, we cannot resolve this simply by converting the SELECT
statements into SELECT...FOR UPDATE statements: although this would fix
the issue by making those queries into locking statements, it also
triggers what appears to be an InnoDB bug when many transactions all
attempt to select-then-insert into a table with both a primary key and a
separate unique key, as the crlShards table has. This causes the
integration tests in GitHub Actions, which run with an empty database
and therefore use the needToInsert codepath instead of the update
codepath, to consistently flake.
Instead, resolve the issue by having the UPDATE statements specify that
the value of the leasedUntil column is still the same as was read by the
initial SELECT. Although two crl-updaters may still attempt these
transactions concurrently, the UPDATE statements will still be fully
sequenced, and the latter one will fail.
Part of https://github.com/letsencrypt/boulder/issues/8031
Replace DCV and CAA checks (PerformValidation and IsCAAValid) in
va/va.go and va/caa.go with their MPIC compliant counterparts (DoDCV and
DoCAA) in va/vampic.go. Deprecate EnforceMultiCAA and EnforceMPIC and
default code paths as though they are both true. Require that RIR and
Perspective be set for primary and remote VAs.
Fixes#7965Fixes#7819
Rather than using context.WithTimeout to magically convert a timeout to
a deadline for us, compute the deadline ourselves and then set it
directly on the context.
Add MaxNames to the set of things that can be configured on a
per-profile basis. Remove all references to the RA's global maxNames,
replacing them with reference's to the current profile's maxNames. Add
code to the RA's main() to copy a globally-configured MaxNames into each
profile, for deployability.
Also remove any understanding of MaxNames from the WFE, as it is
redundant with the RA and is not configured in staging or prod. Instead,
hardcode the upper limit of 100 into the ratelimit package itself.
Fixes https://github.com/letsencrypt/boulder/issues/7993
Add a new RPC to the CA: `IssueCertificate` covers issuance of both the
precertificate and the final certificate. In between, it calls out to
the RA's new method `GetSCTs`.
The RA calls the new `CA.IssueCertificate` if the `UnsplitIssuance`
feature flag is true.
The RA had a metric that counted certificates by profile name and hash.
Since the RA doesn't receive a profile hash in the new flow, simply
record the total number of issuances.
Fixes https://github.com/letsencrypt/boulder/issues/7983
Deprecate the "InsertAuthzsIndividually" feature flag, which has been
set to true in both Staging and Production. Delete the code guarded
behind that flag being false, namely the ability of the MultiInserter to
return the newly-created IDs from all of the rows it has inserted. This
behavior is being removed because it is not supported in MySQL / Vitess.
Fixes https://github.com/letsencrypt/boulder/issues/7718
---
> [!WARNING]
> ~~Do not merge until IN-10737 is complete~~
Update from go1.23.1 to go1.23.6 for our primary CI and release builds.
This brings in a few security fixes that aren't directly relevant to us.
Add go1.24.0 to our matrix of CI and release versions, to prepare for
switching to this next major version in prod.
These paths receive (literally) zero traffic, and they require the WFE
to duplicate the RA's authorization lifetime configuration. Since that
configuration is now per-profile, the WFE can no longer easily replicate
it, and the resulting staleness calculations will be wrong. Remove the
duplicated configuration, remove the unused endpoints that rely on it,
and remove the staleness-checking code which supported those endpoints.
Leave the non-ACME /get/ endpoint for certificates in place, because
checking staleness for those does not require any additional
configuration, and having a non-ACME serial-based API for certificates
is a good thing.
Fixes https://github.com/letsencrypt/boulder/issues/8007
The crl-storer passes along Cache-Control and Expires from the
crl-updater (because the crl-updater knows the UpdatePeriod).
The crl-updater calculates the Expires header based on when it expects
to update the CRL, plus a margin of error.
Fixes#8004
Increase pkilint timeout from 200ms to 2s. In #8006 I found that errors
were stemming from timeouts talking to the bpkilint container. These
probably showed up in TestRevocation particularly because that
integration test now issues for many certificates in parallel. Pkilint's
slowness, combined with the relatively small number of cores in CI,
probably resulted in some requests taking too long.
Remove the RA's deprecated top-level config keys which used to control
order and authz lifetimes. Make the new profile-based config keys which
replaced them required.
Since configuring a profile and default profile name is now mandatory,
always supply a profile name to the CA when requesting issuance.
Fixes https://github.com/letsencrypt/boulder/issues/7986