Change most functions in `ratelimits` to use full ACMEIdentifier(s) as
arguments, instead of using their values as strings. This makes the
plumbing from other packages more consistent, and allows us to:
- Rename `FQDNsToETLDsPlusOne` to `coveringIdentifiers` and handle IP
identifiers, parsing IPv6 addresses into their covering /64 prefixes for
CertificatesPerDomain[PerAccount] bucket keys.
- Port improved IP/CIDR validation logic to NewRegistrationsPerIPAddress &
PerIPv6Range.
- Rename `domain` parts of bucket keys to either `identValue` or
`domainOrCIDR`.
- Rename other internal functions to clarify that they now handle
identifier values, not just domains.
- Add the new reserved IPv6 address range from RFC 9780.
For deployability, don't (yet) rename rate limits themselves; and
because it remains the name of the database table, preserve the term
`fqdnSets`.
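As a rough illustration of the /64 handling described above, the sketch below uses the standard `net/netip` package. The helper name `coveringPrefix` and the choice to express IPv4 addresses as /32 prefixes are assumptions made for this example, not necessarily what `coveringIdentifiers` does.

```go
package main

import (
	"fmt"
	"net/netip"
)

// coveringPrefix is a hypothetical helper for illustration only. It returns
// the CIDR that a per-domain bucket key might use for an IP identifier:
// a /32 for IPv4 (an assumption) and the covering /64 for IPv6.
func coveringPrefix(ip string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", fmt.Errorf("parsing IP identifier %q: %w", ip, err)
	}
	bits := 64
	if addr.Is4() {
		bits = 32
	}
	// Prefix keeps only the top `bits` bits and zeroes the rest.
	prefix, err := addr.Prefix(bits)
	if err != nil {
		return "", fmt.Errorf("computing /%d for %q: %w", bits, ip, err)
	}
	return prefix.String(), nil
}

func main() {
	for _, ip := range []string{"192.0.2.7", "2001:db8:1:2:3:4:5:6"} {
		key, err := coveringPrefix(ip)
		fmt.Println(key, err)
	}
}
```

With this approach, every address inside the same IPv6 /64 maps to the same bucket key value, so a subscriber cannot sidestep the limit by rotating addresses within their own prefix.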
Fixes #8223
Part of #7311
All of the identifiers being passed into the bucket construction helpers
have already passed through policy.WellFormedIdentifiers in the WFE. We
can trust that function, and our own ability to construct bucket keys,
to reduce the amount of revalidation we do before sending bucket keys to
redis.
The validateIdForName function is still used to validate override bucket
keys loaded from yaml.
Add `identifier` fields, which will soon replace the `dnsName` fields,
to:
- `corepb.Authorization`
- `corepb.Order`
- `rapb.NewOrderRequest`
- `sapb.CountFQDNSetsRequest`
- `sapb.CountInvalidAuthorizationsRequest`
- `sapb.FQDNSetExistsRequest`
- `sapb.GetAuthorizationsRequest`
- `sapb.GetOrderForNamesRequest`
- `sapb.GetValidAuthorizationsRequest`
- `sapb.NewOrderRequest`
Populate these `identifier` fields in every function that creates
instances of these structs.
Use these `identifier` fields instead of `dnsName` fields (at least
preferentially) in every function that uses these structs. When crossing
component boundaries, don't assume they'll be present, for
deployability's sake.
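The "prefer `identifier`, fall back to `dnsName`" pattern could look roughly like the sketch below. The `Identifier` and `NewOrderRequest` types here are simplified, hypothetical stand-ins for the generated protobuf structs, and `identifiersFrom` is an illustrative name rather than a function from the codebase.

```go
package main

import "fmt"

// Identifier is a simplified stand-in for the protobuf identifier message.
type Identifier struct {
	Type  string
	Value string
}

// NewOrderRequest models only the two fields relevant to this example.
type NewOrderRequest struct {
	DnsNames    []string      // legacy field, still sent by older components
	Identifiers []*Identifier // new field, may be empty during a rollout
}

// identifiersFrom prefers the new field and falls back to the legacy one,
// so it works regardless of which version the sending component runs.
func identifiersFrom(req *NewOrderRequest) []Identifier {
	if len(req.Identifiers) > 0 {
		out := make([]Identifier, 0, len(req.Identifiers))
		for _, id := range req.Identifiers {
			if id != nil {
				out = append(out, *id)
			}
		}
		return out
	}
	out := make([]Identifier, 0, len(req.DnsNames))
	for _, name := range req.DnsNames {
		out = append(out, Identifier{Type: "dns", Value: name})
	}
	return out
}

func main() {
	oldSender := &NewOrderRequest{DnsNames: []string{"example.com"}}
	newSender := &NewOrderRequest{
		DnsNames:    []string{"example.com"},
		Identifiers: []*Identifier{{Type: "dns", Value: "example.com"}},
	}
	fmt.Println(identifiersFrom(oldSender), identifiersFrom(newSender))
}
```

Both requests yield the same result, which is the property that lets old and new components be deployed in either order.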
Deployability note: Mismatched `cert-checker` and `sa` versions will be
incompatible because of a type change in the arguments to
`sa.SelectAuthzsMatchingIssuance`.
Part of #7311
Add MaxNames to the set of things that can be configured on a
per-profile basis. Remove all references to the RA's global maxNames,
replacing them with references to the current profile's maxNames. Add
code to the RA's main() to copy a globally-configured MaxNames into each
profile, for deployability.
Also remove any understanding of MaxNames from the WFE, as it is
redundant with the RA and is not configured in staging or prod. Instead,
hardcode the upper limit of 100 into the ratelimit package itself.
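A minimal sketch of the deployability shim described above, under the assumption that a profile's own setting wins and the global value only fills in profiles that leave `MaxNames` unset. The type and function names are hypothetical.

```go
package main

import "fmt"

// validationProfile is a hypothetical, pared-down profile config.
type validationProfile struct {
	MaxNames int
}

// applyGlobalMaxNames copies a globally configured MaxNames into every
// profile that does not set its own value (an assumption of this sketch).
func applyGlobalMaxNames(global int, profiles map[string]*validationProfile) {
	for _, p := range profiles {
		if p.MaxNames == 0 {
			p.MaxNames = global
		}
	}
}

// checkNameCount enforces the limit of the profile selected for an order.
func checkNameCount(p *validationProfile, names []string) error {
	if len(names) > p.MaxNames {
		return fmt.Errorf("order has %d names, profile allows at most %d", len(names), p.MaxNames)
	}
	return nil
}

func main() {
	profiles := map[string]*validationProfile{
		"legacy": {},            // inherits the global value
		"modern": {MaxNames: 5}, // keeps its own value
	}
	applyGlobalMaxNames(100, profiles)
	fmt.Println(profiles["legacy"].MaxNames, profiles["modern"].MaxNames)
	fmt.Println(checkNameCount(profiles["modern"], make([]string, 6)))
}
```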
Fixes https://github.com/letsencrypt/boulder/issues/7993
Update from go1.23.1 to go1.23.6 for our primary CI and release builds.
This brings in a few security fixes that aren't directly relevant to us.
Add go1.24.0 to our matrix of CI and release versions, to prepare for
switching to this next major version in prod.
Add a new `ratelimits.NewTransactionBuilderWithLimits` constructor which
takes pre-populated rate limit data, instead of filenames for reading it
off disk.
Use this new constructor to change rate limits during RA tests, instead
of using extra `testdata` files.
Fix ARI renewals' exception from rate limits: consider `isARIRenewal` as
part of the `isRenewal` arg to `checkNewOrderLimits`.
Remove obsolete RA tests for rate limits that are now only checked in
the WFE.
Update remaining new order rate limit tests from deprecated `ratelimit`s
to new Redis `ratelimits`.
The zero value for `limit` is invalid, so returning `nil` in error cases
avoids silently returning invalid limits (and means that if code makes a
mistake and references an invalid limit, the result will be an obvious
failure with a clear stack trace).
This is more consistent, since the methods on `limit` use a pointer
receiver. Also, since `limit` is a fairly large object, this saves some
copying.
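The difference can be sketched as follows. The `limit` fields, the `errNotConfigured` error, and the map-based lookup are simplified assumptions for illustration; only the idea of returning `*limit` with `nil` on error comes from the change itself.

```go
package main

import (
	"errors"
	"fmt"
)

// limit is a simplified stand-in; the real struct has more fields.
type limit struct {
	Burst int64
	Count int64
}

// exceedsBurst uses a pointer receiver, matching the pointer return below.
func (l *limit) exceedsBurst(cost int64) bool {
	return cost > l.Burst
}

// errNotConfigured is a hypothetical error used only in this sketch.
var errNotConfigured = errors.New("limit not configured")

// getLimit returns nil alongside any error, so a caller that ignores the
// error cannot quietly proceed with a zero-valued (invalid) limit.
func getLimit(limits map[string]*limit, name string) (*limit, error) {
	l, ok := limits[name]
	if !ok {
		return nil, errNotConfigured
	}
	return l, nil
}

func main() {
	limits := map[string]*limit{"NewOrdersPerAccount": {Burst: 300, Count: 300}}
	l, err := getLimit(limits, "NewOrdersPerAccount")
	fmt.Println(l.exceedsBurst(1), err)
	missing, err := getLimit(limits, "SomeOtherLimit")
	fmt.Println(missing == nil, err)
}
```

Had `getLimit` returned a `limit` by value, the second call would hand back `limit{}` with `Burst == 0`, which looks like a real limit until a later check misbehaves.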
Related to #7803 and #7797.
In the FailedAuthorizations limits, there was code that intentionally
ignored errLimitDisabled errors (`errors.Is(err, errLimitDisabled)`).
However, that resulted in those functions later using a returned
`limit` value that was invalid (i.e. its zero value). That happened to
trigger some later checks in validateTransaction. Specifically this
check failed:
    if txn.cost > txn.limit.Burst {
        // error
    }
When txn.limit.Burst is zero, this condition is true for any positive
cost, so validation always fails.
This problem doesn't really show up in prod, where all the limits are
configured. But it showed up in tests, specifically
TestPerformValidation_FailedValidationsTriggerPauseIdentifiersRatelimit,
where the limits are constructed using a simplified config that leaves
most of them disabled.
In this change, I tried to make handling of errLimitDisabled more
consistent, and always return an allow-only transaction as early as
possible instead of falling through the error condition.
Where that wasn't possible, I used a boolean to record whether the
result of `builder.getLimit()` was valid before referencing any of its
fields.
I also added some "shouldn't happen" errors to catch this problem
earlier if it recurs.
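The early-return handling might be sketched like this. The struct layouts, the `allowOnlyTransaction` helper, and the limit name passed to `getLimit` are assumptions for the example; `errLimitDisabled`, `getLimit`, and `validateTransaction` are named after the identifiers mentioned above but are not copies of the real code.

```go
package main

import (
	"errors"
	"fmt"
)

var errLimitDisabled = errors.New("limit disabled")

type limit struct{ Burst int64 }

type transaction struct {
	bucketKey string
	limit     *limit
	cost      int64
	allowOnly bool
}

type builder struct{ limits map[string]*limit }

func (b *builder) getLimit(name string) (*limit, error) {
	l, ok := b.limits[name]
	if !ok {
		return nil, errLimitDisabled
	}
	return l, nil
}

// allowOnlyTransaction is a hypothetical helper: a transaction that is
// always permitted and never touches a bucket.
func allowOnlyTransaction() transaction { return transaction{allowOnly: true} }

func validateTransaction(txn transaction) (transaction, error) {
	if txn.allowOnly {
		return txn, nil
	}
	if txn.limit == nil {
		// The "shouldn't happen" guard: fail loudly rather than read
		// fields of a limit that was never set.
		return transaction{}, errors.New("transaction has no limit; this shouldn't happen")
	}
	if txn.cost > txn.limit.Burst {
		return transaction{}, fmt.Errorf("cost %d exceeds burst %d", txn.cost, txn.limit.Burst)
	}
	return txn, nil
}

// failedAuthzTransaction returns early when the limit is disabled instead
// of falling through with an unusable limit.
func (b *builder) failedAuthzTransaction(key string) (transaction, error) {
	l, err := b.getLimit("FailedAuthorizationsPerDomainPerAccount")
	if err != nil {
		if errors.Is(err, errLimitDisabled) {
			return allowOnlyTransaction(), nil
		}
		return transaction{}, err
	}
	return validateTransaction(transaction{bucketKey: key, limit: l, cost: 1})
}

func main() {
	b := &builder{limits: map[string]*limit{}} // simplified config: limit disabled
	txn, err := b.failedAuthzTransaction("123:example.com")
	fmt.Println(txn.allowOnly, err)
}
```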
I removed some "skip disabled limit" comments because those say "what
the code does" (which the code also says), not "why the code does it".
Fixes the test failures in #7797.
- Added a new key-value ratelimit
`FailedAuthorizationsForPausingPerDomainPerAccount` which is incremented
each time a client fails a validation.
- As long as capacity exists in the bucket, a successful validation
attempt will reset the bucket back to full capacity.
- Upon exhausting bucket capacity, the RA will send a gRPC request to the SA to
pause the `account:identifier`. Further validation attempts will be
rejected by the [WFE](https://github.com/letsencrypt/boulder/pull/7599).
- Added a new feature flag, `AutomaticallyPauseZombieClients`, which
enables automatic pausing of zombie clients in the RA.
- Added a new RA metric `paused_pairs{"paused":[bool],
"repaused":[bool], "grace":[bool]}` to monitor use of this new
functionality.
- Updated `ra_test.go` `initAuthorities` to allow accessing the
`*ratelimits.RedisSource` for checking that the new ratelimit functions
as intended.
Co-authored-by: @pgporada
Fixes https://github.com/letsencrypt/boulder/issues/7738
---------
Co-authored-by: Phil Porada <pporada@letsencrypt.org>
Co-authored-by: Phil Porada <philporada@gmail.com>