Add support for managing and querying rate limit overrides in the
database.
- Add `sa.AddRateLimitOverride` to insert or update a rate limit
override. This will be used during Rate Limit Override Portal to commit
approved overrides to the database.
- Add `sa.DisableRateLimitOverride` and `sa.EnableRateLimitOverride` to
toggle override state. These will be used by the `admin` tool.
- Add `sa.GetRateLimitOverride` to retrieve a single override by limit
enum and bucket key. This will be used by the Rate Limit Portal to
prevent duplicate or downgrade requests but allow upgrade requests.
- Add `sa.GetEnabledRateLimitOverrides` to stream all currently enabled
overrides. This will be used by the rate limit consumers (`wfe` and
`ra`) to refresh the overrides in-memory.
- Implement test coverage for all new methods.
Remove code using `certificatesPerName` & `newOrdersRL` tables.
Deprecate `DisableLegacyLimitWrites` & `UseKvLimitsForNewOrder` flags.
Remove legacy `ratelimit` package.
Delete these RA test cases:
- `TestAuthzFailedRateLimitingNewOrder` (rl:
`FailedAuthorizationsPerDomainPerAccount`)
- `TestCheckCertificatesPerNameLimit` (rl: `CertificatesPerDomain`)
- `TestCheckExactCertificateLimit` (rl: `CertificatesPerFQDNSet`)
- `TestExactPublicSuffixCertLimit` (rl: `CertificatesPerDomain`)
Rate limits in NewOrder are now enforced by the WFE, starting here:
5a9b4c4b18/wfe2/wfe.go (L781)
We collect a batch of transactions to check limits, check them all at
once, go through and find which one(s) failed, and serve the failure
with the Retry-After that's furthest in the future. All this code
doesn't really need to be tested again; what needs to be tested is that
we're returning the correct failure. That code is
`NewOrderLimitTransactions`, and the `ratelimits` package's tests cover
this.
The public suffix handling behavior is tested by
`TestFQDNsToETLDsPlusOne`:
5a9b4c4b18/ratelimits/utilities_test.go (L9)
Some other RA rate limit tests were deleted earlier, in #7869.
Part of #7671.
Add the storage implementation for our new (account, hostname) pair
pausing feature.
- Add schema and model for for the new paused table
- Add SA service methods for interacting with the paused table
Part of #7406
Part of #7475
Cert checker has been failing in config-next because the
"CertCheckerRequiresCorrespondence" flag is enabled, but no permissions
were granted on the precertificates table. While this did cause it to
log profusively, this did not cause the tests to fail because
integration-test.py simply executes cert-checker without caring about
its exit code.
Add a new "revokedCertificates" table to the database schema. This table
is similar to the existing "certificateStatus" table in many ways, but
the idea is that it will only have rows added to it when certificates
are revoked, not when they're issued. Thus, it will grow many orders of
magnitude slower than the certificateStatus table does. Eventually, it
will replace that table entirely.
The one column that revokedCertificates adds is the new "ShardIdx"
column, which is the CRL shard in which the revoked certificate will
appear. This way we can assign certificates to CRL shards at the time
they are revoked, and guarantee that they will never move to a different
shard even if we change the number of shards we produce. This will
eventually allow us to put CRL URLs directly into our certificates,
replacing OCSP URLs.
Add new logic to the SA's RevokeCertificate and UpdateRevokedCertificate
methods to handle this new table. If these methods receive a request
which specifies a CRL shard (our CRL shards are 1-indexed, so shard 0
does not exist), then they will ensure that the new revocation status is
written into both the certificateStatus and revokedCertificates tables.
This logic will not function until the RA is updated to take advantage
of it, so it is not a risk for it to appear in Boulder before the new
table has been created.
Also add new logic to the SA's GetRevokedCertificates method. Similar to
the above, this reads from the new table if the ShardIdx field is
supplied in the request message. This code will not operate until the
crl-updater is updated to include this field. We will not perform this
update for a minimum of 100 days after this code is deployed, to ensure
that all unexpired revoked certificates are present in the
revokedCertificates table.
Part of https://github.com/letsencrypt/boulder/issues/7094
Add two new methods, LeaseCRLShard and UpdateCRLShard, to the SA gRPC
interface. These methods work in concert both to prevent multiple
instances of crl-updater from stepping on each others toes, and to lay
the groundwork for a less bursty version of crl-updater in the future.
Introduce a new database table, crlShards, which tracks the thisUpdate
and nextUpdate timestamps of each CRL shard for each issuer. It also has
a column "leasedUntil", which is also a timestamp. Grant the SA user
read-write access to this table.
LeaseCRLShard updates the leasedUntil column of the identified shard to
the given time. It returns an error if the identified shard's
leasedUntil timestamp is already in the future. This provides a
mechanism for crl-updater instances to "lick the cookie", so to speak,
marking CRL shards as "taken" so that multiple crl-updater instances
don't attempt to work on the same shard at the same time. Using a
timestamp has the added benefit that leases are guaranteed to expire,
ensuring that we don't accidentally fail to work on a shard forever.
LeaseCRLShard has a second mode of operation, when a range of potential
shards is given in the request, rather than a single shard. In this
mode, it returns the shard (within the given range) whose thisUpdate
timestamp is oldest. (Shards with no thisUpdate timestamp, including
because the requested range includes shard indices the database doesn't
yet know about, count as older than any shard with any thisUpdate
timestamp.) This allows crl-updater instances which don't care which
shard they're working on to do the most urgent work first.
UpdateCRLShard updates the thisUpdate and nextUpdate timestamps of the
identified shard. This closes the loop with the second mode of
LeaseCRLShard above: by updating the thisUpdate timestamp, the method
marks the shard as no longer urgently needing to be worked on.
IN-9220 tracks creating this table in staging and production
Part of #6897
When a user wants their email address deleted from the database but no
longer has access to their account, this allows an administrator to
clear it.
This adds `admin` as an alias for `admin-revoker`, because we'd like the
clear-email sub-command to be a part of that overall tool, but it's not
really revocation related.
Part of #6864
Delete the ocsp-updater service, and the //ocsp/updater library that
supports it. Remove test configs for the service, and remove references
to the service from other test files.
This service has been fully shut down for an extended period now, and is
safe to remove.
Fixes#6499
Add an upstream ProxySQL container to our docker-compose. Configure
ProxySQL to manage database connections for our unit and integration
tests.
Fixes#5873
This includes two feature flags: one that controls turning on the extra
database queries, and one that causes cert-checker to fail on missing
validations. If the second flag isn't turned on, it will just emit error
log lines. This will help us find any edge conditions we need to deal
with before making the new code trigger alerts.
Fixes#6562
Create a new gRPC service named StorageAuthorityReadOnly which only
exposes a read-only subset of the existing StorageAuthority service's
methods.
Implement this by splitting the existing SA in half, and having the
read-write half embed and wrap an instance of the read-only half.
Unfortunately, many of our tests use exported read-write methods as part
of their test setup, so the tests are all being performed against the
read-write struct, but they are exercising the same code as the
read-only implementation exposes.
Expose this new service at the SA on the same port as the existing
service, but with (in config-next) different sets of allowed clients. In
the future, read-only clients will be removed from the read-write
service's set of allowed clients.
Part of #6454
- Move incidents tables from `boulder_sa` to `incidents_sa` (added in #6344)
- Grant read perms for all tables in `incidents_sa`
- Modify unit tests to account for new schema and grants
- Add database cleaning func for `boulder_sa`
- Adjust cleanup funcs to omit `sql-migrate` tables instead of `goose`
Resolves#6328
In dev docker we've always used a single schema (`boulder_sa`), with two
environments (`test` and `integration`) making for a combined total of two
databases sharing the same users and schema (e.g. `boulder_sa_test` and
`boulder_sa_integration`). There are also two versions of this schema. `db` and
`db-next`. The former is the schema as it should exist in production and the
latter is everything from `db` with some un-deployed schema changes. This change
adds support for additional schemas with the same aforementioned environments
and versions.
- Add support for additional schemas in `test/create_db.sh` and sa/migrations.sh
- Add new schema `incidents_sa` with its own users
- Replace `bitbucket.org/liamstask/goose/` with `github.com/rubenv/sql-migrate`
Part of #6328