boulder

Commit Graph

Author	SHA1	Message	Date
James Renken	48d5ad3c19	ratelimits: Add IP address identifier support (#8221 ) Change most functions in `ratelimits` to use full ACMEIdentifier(s) as arguments, instead of using their values as strings. This makes the plumbing from other packages more consistent, and allows us to: Rename `FQDNsToETLDsPlusOne` to `coveringIdentifiers` and handle IP identifiers, parsing IPv6 addresses into their covering /64 prefixes for CertificatesPerDomain[PerAccount] bucket keys. Port improved IP/CIDR validation logic to NewRegistrationsPerIPAddress & PerIPv6Range. Rename `domain` parts of bucket keys to either `identValue` or `domainOrCIDR`. Rename other internal functions to clarify that they now handle identifier values, not just domains. Add the new reserved IPv6 address range from RFC 9780. For deployability, don't (yet) rename rate limits themselves; and because it remains the name of the database table, preserve the term `fqdnSets`. Fixes #8223 Part of #7311	2025-06-12 11:47:32 -07:00
Aaron Gable	d951304b54	Ratelimits: don't validate our own constructed bucket keys (#8225 ) All of the identifiers being passed into the bucket construction helpers have already passed through policy.WellFormedIdentifiers in the WFE. We can trust that function, and our own ability to construct bucket keys, to reduce the amount of revalidation we do before sending bucket keys to redis. The validateIdForName function is still used to validate override bucket keys loaded from yaml.	2025-06-03 15:43:07 -07:00
Aaron Gable	23608e19c5	Simplify docker-compose network setup (#8214 ) Remove static IPs from services that can be reached by their service name. Remove consulnet and redisnet, and have the services which connected to those network connect directly to bouldernet instead. Instruct docker-compose to only dynamically allocate IPs from the upper half of the bouldernet subset, to avoid clashing with the few static IPs we still specify.	2025-05-30 13:23:27 -07:00
James Renken	ac68828f43	Replace most uses of net.IP with netip.Addr (#8205 ) Retain `net.IP` only where we directly work with `x509.Certificate` and friends. Fixes #5925 Depends on #8196	2025-05-27 15:05:35 -07:00
James Renken	648ab05b37	policy: Support IP address identifiers (#8173 ) Add `pa.validIP` to test IP address validity & absence from IANA reservations. Modify `pa.WillingToIssue` and `pa.WellFormedIdentifiers` to support IP address identifiers. Add a map of allowed identifier types to the `pa` config. Part of #8137	2025-05-14 13:49:51 -07:00
James Renken	3f879ed0b4	Add Identifiers to Authorization & Order structs (#7961 ) Add `identifier` fields, which will soon replace the `dnsName` fields, to: - `corepb.Authorization` - `corepb.Order` - `rapb.NewOrderRequest` - `sapb.CountFQDNSetsRequest` - `sapb.CountInvalidAuthorizationsRequest` - `sapb.FQDNSetExistsRequest` - `sapb.GetAuthorizationsRequest` - `sapb.GetOrderForNamesRequest` - `sapb.GetValidAuthorizationsRequest` - `sapb.NewOrderRequest` Populate these `identifier` fields in every function that creates instances of these structs. Use these `identifier` fields instead of `dnsName` fields (at least preferentially) in every function that uses these structs. When crossing component boundaries, don't assume they'll be present, for deployability's sake. Deployability note: Mismatched `cert-checker` and `sa` versions will be incompatible because of a type change in the arguments to `sa.SelectAuthzsMatchingIssuance`. Part of #7311	2025-03-26 10:30:24 -07:00
James Renken	9f4b18c6ce	identifier: Rename FromDNSNames & AsProto; add ACMEIdentifiers named type (#8070 ) Rename `FromDNSNames` to `NewDNSSlice`, since it's exactly `NewDNS` except for slices. Rename `AsProto` to use the "To" prefix, since it's the opposite of "From". Add a named type `ACMEIdentifiers` so that we can add methods to slices. We will have a lot of slice handling code coming up, which this will make more elegant and readable. Add a comment to explain naming conventions in the `identifier` package. Part of #7311 Alternative to #8068	2025-03-19 17:03:39 -04:00
James Renken	cb94164b54	policy: Add initial Identifier support (#8064 ) Change WillingToIssue and WellFormedDomainNames to use Identifiers, and (for now) reject non-DNS identifiers. Part of #7311	2025-03-14 11:34:59 -07:00
Aaron Gable	a2141cb695	RA: Control MaxNames via profile (#8019 ) Add MaxNames to the set of things that can be configured on a per-profile basis. Remove all references to the RA's global maxNames, replacing them with reference's to the current profile's maxNames. Add code to the RA's main() to copy a globally-configured MaxNames into each profile, for deployability. Also remove any understanding of MaxNames from the WFE, as it is redundant with the RA and is not configured in staging or prod. Instead, hardcode the upper limit of 100 into the ratelimit package itself. Fixes https://github.com/letsencrypt/boulder/issues/7993	2025-02-27 15:51:00 -06:00
Aaron Gable	212a66ab49	Update go versions in CI and release (#7971 ) Update from go1.23.1 to go1.23.6 for our primary CI and release builds. This brings in a few security fixes that aren't directly relevant to us. Add go1.24.0 to our matrix of CI and release versions, to prepare for switching to this next major version in prod.	2025-02-19 14:37:01 -08:00
Samantha Frank	6402a2275f	ratelimits: Remove a metric and some labels that we're not finding useful (#7902 )	2024-12-20 08:44:08 -05:00
James Renken	62299362bd	ra/ratelimits: Update tests, use new TransactionBuilder constructor, fix ARI rate limit exception (#7869 ) Add a new `ratelimits.NewTransactionBuilderWithLimits` constructor which takes pre-populated rate limit data, instead of filenames for reading it off disk. Use this new constructor to change rate limits during RA tests, instead of using extra `testdata` files. Fix ARI renewals' exception from rate limits: consider `isARIRenewal` as part of the `isRenewal` arg to `checkNewOrderLimits`. Remove obsolete RA tests for rate limits that are now only checked in the WFE. Update remaining new order rate limit tests from deprecated `ratelimit`s to new Redis `ratelimits`.	2024-12-18 14:23:13 -08:00
Samantha Frank	11d543bd98	ratelimits: Correctly handle stale and concurrently initialized buckets (#7886 ) #7782 fixed an issue where concurrent requests to the same existing bucket ignored all but one rate limit spend. However, concurrent requests to the same empty bucket can still cause multiple initializations that skip all but one spend. Use BatchSetNotExisting (SETNX in Redis) to detect this scenario and then fall back to BatchIncrement (INCRBY in Redis). #7782 sets the TTL (Time-To-Live) of incremented buckets to the maximum possible burst for the applied limit. Because this TTL doesn’t match the TAT, these buckets can become "stale," holding a TAT in the past. Incrementing these stale buckets by cost * emissionInterval leaves the new TAT behind the current time, allowing clients who are sometimes idle to gain extra burst capacity. Instead, use BatchSet (SET in Redis) to overwrite the TAT to now + cost * emissionInterval. Though this introduces a similar race condition as empty buckets, it’s less harmful than granting extra burst capacity.	2024-12-17 12:42:51 -05:00
Jacob Hoffman-Andrews	40e100c297	doc: replace "leaky" with "token" bucket (#7881 ) Mostly we refer consistently to token bucket, but these two places (one of which is soon to be removed) still had the "leaky" terminology, which is potentially confusing.	2024-12-10 16:39:30 -08:00
Jacob Hoffman-Andrews	27e65f3e9f	ratelimits: add detail to error messages (#7871 ) For batch operations, include the operation and the number of keys in the error message. This should help diagnose whether we are getting `i/o timeout` errors disproportionately for larger requests, or for certain operations. Also, make the ignored errors part of the overall WFE request logs, which allows us to get additional context, like whether certain requesters or domain names are getting disproportionately many errors. Related to #7846.	2024-12-05 15:58:26 -08:00
Aaron Gable	d962c61067	RA and WFE tests: use inmem rate limit source (#7859 ) The purpose of these RA and WFE unit tests is to test how they deal with certain rate limit conditions, not to test talking to an actual redis instance. Streamline the tests by having them talk to an in-memory rate limits store, rather than a redis-backed one.	2024-12-03 14:52:16 -08:00
Aaron Gable	bac5602c6d	Always use INCRBY for redis rate limits (#7856 ) Deprecate the IncrementRateLimits feature flag, and always use the redis INCRBY instruction to update rate limit TATs. Fixes https://github.com/letsencrypt/boulder/issues/7855	2024-12-02 15:25:33 -08:00
Jacob Hoffman-Andrews	20fdcbcfe0	ratelimits: always use a pointer for limit (#7804 ) The zero value for `limit` is invalid, so returning `nil` in error cases avoids silently returning invalid limits (and means that if code makes a mistake and references an invalid limit it will be an obvious clear stack trace). This is more consistent, since the methods on `limit` use a pointer receiver. Also, since `limit` is a fairly large object, this saves some copying. Related to #7803 and #7797.	2024-11-15 13:45:07 -08:00
Jacob Hoffman-Andrews	26a9910911	ratelimits: improve disabled limit handling (#7803 ) In the FailedAuthorizations limits, there was code that intentionally ignored errLimitDisabled errors (`errors.Is(err, errLimitDisabled)`). However, that that resulted in those functions later using a returned `limit` value that was invalid (i.e. its zero value). That happened to trigger some later checks in validateTransaction. Specifically this check failed: if txn.cost > txn.limit.Burst { // error When txt.limit.Burst is zero, this will always fail. This problem doesn't really show up in prod, where all the limits are configured. But it showed up in tests, specifically TestPerformValidation_FailedValidationsTriggerPauseIdentifiersRatelimit, where the limits are constructed using a simplified config that leaves most of them disabled. In this change, I tried to make handling of errLimitDisabled more consistent, and always return an allow-only transaction as early as possible instead of falling through the error condition. Where that wasn't possible, I used a boolean to record whether the result of `builder.getLimit()` was valid before referencing any of its fields. I also added some "shouldn't happen" errors to catch this problem earlier if it recurs. I removed some "skip disabled limit" comments because those say "what the code does" (which the code also says), not "why the code does it". Fixes the test failures in #7797.	2024-11-13 16:23:50 -05:00
Kruti Sutaria	a79a830f3b	ratelimits: Auto pause zombie clients (#7763 ) - Added a new key-value ratelimit `FailedAuthorizationsForPausingPerDomainPerAccount` which is incremented each time a client fails a validation. - As long as capacity exists in the bucket, a successful validation attempt will reset the bucket back to full capacity. - Upon exhausting bucket capacity, the RA will send a gRPC to the SA to pause the `account:identifier`. Further validation attempts will be rejected by the [WFE](https://github.com/letsencrypt/boulder/pull/7599). - Added a new feature flag, `AutomaticallyPauseZombieClients`, which enables automatic pausing of zombie clients in the RA. - Added a new RA metric `paused_pairs{"paused":[bool], "repaused":[bool], "grace":[bool]}` to monitor use of this new functionality. - Updated `ra_test.go` `initAuthorities` to allow accessing the `*ratelimits.RedisSource` for checking that the new ratelimit functions as intended. Co-authored-by: @pgporada Fixes https://github.com/letsencrypt/boulder/issues/7738 --------- Co-authored-by: Phil Porada <pporada@letsencrypt.org> Co-authored-by: Phil Porada <philporada@gmail.com>	2024-11-08 13:51:41 -08:00
James Renken	4adc65fb7d	Rate limits: replace redis SET with INCRBY (#7782 ) Add a new method, `BatchIncrement`, to issue `IncrBy` (instead of `Set`) to Redis. This helps prevent the race condition that allows bursts of near-simultaneous requests to, effectively, spend the same token. Call this new method when incrementing an existing key. New keys still need to use `BatchSet` because Redis doesn't have a facility to, within a single operation, increment _or_ set a default value if none exists. Add a new feature flag, `IncrementRateLimits`, gating the use of this new method. CPS Compliance Review: This feature flag does not change any behaviour that is described or constrained by our CP/CPS. The closest relation would just be API availability in general. Fixes #7780	2024-11-04 11:20:44 -08:00
Samantha Frank	6e6c8fe480	ratelimits: Update errors to deep link to individual limits documentation (#7767 ) Updates rate limits error messages to deep link to new website docs added in https://github.com/letsencrypt/website/pull/1756.	2024-10-25 13:55:51 -04:00
Samantha Frank	d17d71cc6e	ratelimits: Rename bucket.go to transaction.go (#7753 )	2024-10-16 18:57:48 -04:00
Samantha Frank	c5dae06ffc	ratelimits: Add unit test coverage for TransactionBuilder methods (#7752 )	2024-10-09 19:30:51 -04:00
huochexizhan	a6dc97cb5b	fix: fix slice init length (#7731 ) Initialize a slice with a capacity of len(nameToString) rather than initializing the length of this slice. Signed-off-by: huochexizhan <huochexizhan@outlook.com>	2024-10-08 11:32:25 -04:00
Samantha Frank	2e19a362ec	WFE/RA: Default codepaths to CheckRenewalExemptionAtWFE: true (#7745 ) Also, remove redundant renewal checks in `RA.checkNewOrdersPerAccountLimit()` and `RA.checkCertificatesPerNameLimit()`. Part of #7511	2024-10-07 15:12:30 -04:00
Samantha Frank	d656afce78	ratelimits: Rename DomainsForRateLimiting() to clarify use (#7746 ) Rename as suggested by @jsha in #7729.	2024-10-07 14:56:36 -04:00
Samantha Frank	d850e633ae	ratelimits: Use full domain for the FailedAuthorizations limit (#7729 ) The key-value implementation for the Failed Authorizations limit mistakenly used eTLD+1 instead of the full domain, unlike its predecessor.	2024-09-30 14:50:35 -04:00
Samantha Frank	1b6e0867ae	ratelimits: Set a TTL each time we store bucket data in Redis (#7720 ) Set the Redis TTL to TAT (theoretical arrival time) plus a 10-minute buffer to account for possible clock skew.	2024-09-26 11:19:53 -04:00
Samantha Frank	14c0b2c3bb	ratelimits: Check at NewOrder and SpendOnly later (#7669 ) - Check `CertificatesPerDomain` at newOrder and spend at Finalize time. - Check `CertificatesPerAccountPerDomain` at newOrder and spend at Finalize time. - Check `CertificatesPerFQDNSet` at newOrder and spend at Finalize time. - Fix a bug in`FailedAuthorizationsPerDomainPerAccountSpendOnlyTransaction()` which results in failed authorizations being spent for the exact FQDN, not the eTLD+1. - Remove redundant "max names" check at transaction construction time - Enable key-value rate limits in the RA	2024-08-15 19:08:17 -04:00
Samantha Frank	31d0ff0f98	ratelimits: Replace Decision merging with always returning most restrictive (#7667 ) Fix a bug added in #7653 which sometimes attributed an "Allowed" `Transaction` to the amalgamated "Denied" `Decision`. Instead, always return the most restrictive `*Decision` in the batch. Remove a debug `fmt.Printf()` call added in #7653	2024-08-15 17:49:34 -04:00
Samantha Frank	6a3e9d725b	ratelimits: Provide verbose user-facing rate limit errors (#7653 ) - Instruct callers to call Decision.Result() to check the result of rate limit transactions - Preserve the Transaction within the resulting Decision - Generate consistently formatted verbose errors using the metadata found in the *Decision - Fix broken key-value rate limits integration test in TestDuplicateFQDNRateLimit Fixes #7577	2024-08-12 16:14:15 -04:00
Aaron Gable	61b484c13b	Update to math/rand/v2 (#7657 ) Replace all of Boulder's usage of the Go stdlib "math/rand" package with the newer "math/rand/v2" package which first became available in go1.22. This package has an improved API and faster performance across the board. See https://go.dev/blog/randv2 and https://go.dev/blog/chacha8rand for details.	2024-08-12 09:17:09 -07:00
Samantha Frank	a6e0fdc80e	ratelimits: Fix latency calculations (#7627 )	2024-07-24 22:07:33 -04:00
Phil Porada	ebb52990ca	test: Remove loop variable rebinding (#7587 ) [Gopls](https://github.com/golang/go/issues/66876) had a recent update which fixed my text editor from complaining about the "loop variable being captured by func literal". Fixes https://github.com/letsencrypt/boulder/issues/7454	2024-07-12 10:43:25 -04:00
Samantha Frank	55c274d132	ratelimits: Exempt renewals from NewOrdersPerAccount and CertificatesPerDomain (#7513 ) - Rename `NewOrderRequest` field `LimitsExempt` to `IsARIRenewal` - Introduce a new `NewOrderRequest` field, `IsRenewal` - Introduce a new (temporary) feature flag, `CheckRenewalExemptionAtWFE` WFE: - Perform renewal detection in the WFE when `CheckRenewalExemptionAtWFE` is set - Skip (key-value) `NewOrdersPerAccount` and `CertificatesPerDomain` limit checks when renewal detection indicates the the order is a renewal. RA: - Leave renewal detection in the RA intact - Skip renewal detection and (legacy) `NewOrdersPerAccount` and `CertificatesPerDomain` limit checks when `CheckRenewalExemptionAtWFE` is set and the `NewOrderRequest` indicates that the order is a renewal. Fixes #7508 Part of #5545	2024-06-27 16:39:31 -04:00
Samantha Frank	a38ed99341	ratelimits: Move transaction construction out of the WFE (#7557 ) - Shrink the number of public `ratelimits` methods by relocating two sizeable transaction constructors. Simplify the spend and refund call-sites in the WFE. - Spend calls now block instead of being called asynchronously.	2024-06-26 11:49:28 -04:00
Samantha	063db40db2	ratelimits: Fix cardinality explosion in overrideUsageGauge (#7548 )	2024-06-14 13:34:40 -04:00
Jacob Hoffman-Andrews	e198d3529d	wfe: check well-formedness of requested names early (#7530 ) This allows us to give a user-meaningful error about malformed names early on, instead of propagating internal errors from the new rate limiting system. This moves the well-formedness logic from `WillingToIssue` into a new function `WellFormedDomainNames`, which calls `ValidDomain` on each name and combines the errors into suberrors if there is more than one. `WillingToIssue` now calls `WellFormedDomainNames` to keep the existing behavior. Additionally, WFE calls `WellFormedDomainNames` before checking rate limits. This creates a slight behavior change: If an order contains both malformed domain names and wellformed but blocked domain names, suberrors will only be generated for the malformed domain names. This is reflected in the changes to `TestWillingToIssue_Wildcard`. Adds a WFE test case for receiving malformed identifiers in a new-order request. Follows up on #3323 and #7218 Fixes #7526 Some small incidental fixes: - checkWildcardHostList was checking `pa.blocklist` for `nil` before accessing `pa.wildcardExactBlocklist`. Fix that. - move table test for WillingToIssue into a new test case for WellFormedDomainNames - move two standalone test cases into the big table test	2024-06-10 13:46:55 -07:00
Aaron Gable	3786401f10	Improve redis rate limit construction errors (#7525 ) Change ratelimits.validateIdForName to call the appropriate validate function based on the contents of the to-be-validated ID, rather than falling back and potentially performing multiple validations. Previously, if an ID like "12345:bad.domain" was passed into this function, it would fail the first validateRegIdDomain validation due to having a bad domain name (no TLD), fall back to the simpler validateRegId function, and then fail that because it contains a colon. This obscured the true reason for the failure. Changing this code to not fall back means that the true reason for the id validation failure will be exposed in the error message.	2024-06-05 12:14:10 -07:00
Aaron Gable	1694f183f2	Fix source_redis_test context import (#7491 ) For unknown reasons, this one test file was importing the old context package.	2024-05-17 11:55:54 -07:00
Aaron Gable	146b78a0f7	Remove all static minica keys (#7489 ) Remove the redis-tls, wfe-tls, and mail-test-srv keys which were generated by minica and then checked in to the repo. All three are replaced by the dynamically-generated ipki directory. Part of https://github.com/letsencrypt/boulder/issues/7476	2024-05-17 11:45:40 -07:00
Samantha	16d55ef120	ratelimits: Support new Comment field for each Id entry (#7480 ) Fixes #7478	2024-05-13 14:16:51 -04:00
Samantha	5c27eadb1d	ratelimits: Remove deprecated overrides format (#7458 ) Fixes #7198	2024-05-08 15:38:21 -04:00
Samantha	a9c2fa3f69	ratelimits: Small comment for clarification (#7472 ) Fix some lacking documentation around our only "override only" limit.	2024-05-06 13:47:30 -04:00
forcedebug	b33d28c8bd	Remove repeated words in comments (#7445 ) Signed-off-by: forcedebug <forcedebug@outlook.com>	2024-04-23 10:30:33 -04:00
Aaron Gable	e05d47a10a	Replace explicit int loops with range-over-int (#7434 ) This adopts modern Go syntax to reduce the chance of off-by-one errors and remove unnecessary loop variable declarations. Fixes https://github.com/letsencrypt/boulder/issues/7227	2024-04-22 10:34:51 -07:00
Aaron Gable	a7b73450d5	Re-enable lints on go1.22 (#7412 ) We had disabled our lints on go1.22 because golangci-lint and staticcheck didn't work with some of its updates. Re-enable them, and fix the things which the updated linters catch now. Fixes https://github.com/letsencrypt/boulder/issues/7229	2024-04-04 08:14:29 -07:00
Samantha	7e5c1ca4bd	RA: Count failed authorizations using key-value rate limits (#7346 ) Part of #5545	2024-03-11 17:24:45 -04:00
Samantha	529157ce56	ratelimits: Fix transaction building for Failed Authorizations Limit (#7344 ) - Update the failed authorizations limit to use 'enum:regId:domain' for transactions while maintaining 'enum:regId' for overrides. - Modify the failed authorizations transaction builder to generate a transaction for each order name. - Rename the `FailedAuthorizationsPerAccount` enum to `FailedAuthorizationsPerDomainPerAccount` to align with its corrected implementation. This change is possible because the limit isn't yet deployed in staging or production. Blocks #7346 Part of #5545	2024-03-06 13:48:32 -05:00

1 2

66 Commits