boulder

Commit Graph

Author	SHA1	Message	Date
James Renken	3f879ed0b4	Add Identifiers to Authorization & Order structs (#7961 ) Add `identifier` fields, which will soon replace the `dnsName` fields, to: - `corepb.Authorization` - `corepb.Order` - `rapb.NewOrderRequest` - `sapb.CountFQDNSetsRequest` - `sapb.CountInvalidAuthorizationsRequest` - `sapb.FQDNSetExistsRequest` - `sapb.GetAuthorizationsRequest` - `sapb.GetOrderForNamesRequest` - `sapb.GetValidAuthorizationsRequest` - `sapb.NewOrderRequest` Populate these `identifier` fields in every function that creates instances of these structs. Use these `identifier` fields instead of `dnsName` fields (at least preferentially) in every function that uses these structs. When crossing component boundaries, don't assume they'll be present, for deployability's sake. Deployability note: Mismatched `cert-checker` and `sa` versions will be incompatible because of a type change in the arguments to `sa.SelectAuthzsMatchingIssuance`. Part of #7311	2025-03-26 10:30:24 -07:00
Aaron Gable	ebf232cccb	Return updated account object on DeactivateRegistration path (#8060 ) Update the SA to re-query the database for the updated account after deactivating it, and return this to the RA. Update the RA to pass this value through to the WFE. Update the WFE to return this value, rather than locally modifying the pre-deactivation account object, if it gets one (for deployability). Also remove the RA's requirement that the request object specify its current status so that the request can be trimmed down to just an ID. This proto change is backwards-compatible because the new DeactivateRegistrationRequest's registrationID field has the same type (int64) and field number (1) as corepb.Registration's id field. Part of https://github.com/letsencrypt/boulder/issues/5554	2025-03-14 14:17:42 -07:00
James Renken	edc3c7fa6d	Shorten "identifier(s)" in variable names & function arguments (#8066 ) For consistency, and to prevent confusion with the `identifier` package, use "ident(s)" instead. Part of #7311	2025-03-14 10:59:38 -07:00
Samantha Frank	428fcb30de	ARI: Store and reflect optional "replaces" value for Orders (#8056 ) - Plumb the "replaces" value from the WFE through to the SA via the RA - Store validated "replaces" value for new orders in the orders table - Reflect the stored "replaces" value to subscribers in the order object - Reorder CertificateProfileName before Replaces/ReplacesSerial in RA and SA protos for consistency Fixes #8034	2025-03-12 15:09:29 -04:00
Aaron Gable	28b49a82d4	SA: Improve concurrency robustness of CRL leasing transactions (#8030 ) In a few places within the SA, we use explicit transactions to wrap read-then-update style operations. Because we set the transaction isolation level on a per-session basis, these transactions do not in fact change their isolation level, and therefore generally remain at the default isolation level of REPEATABLE READ. Unfortunately, we cannot resolve this simply by converting the SELECT statements into SELECT...FOR UPDATE statements: although this would fix the issue by making those queries into locking statements, it also triggers what appears to be an InnoDB bug when many transactions all attempt to select-then-insert into a table with both a primary key and a separate unique key, as the crlShards table has. This causes the integration tests in GitHub Actions, which run with an empty database and therefore use the needToInsert codepath instead of the update codepath, to consistently flake. Instead, resolve the issue by having the UPDATE statements specify that the value of the leasedUntil column is still the same as was read by the initial SELECT. Although two crl-updaters may still attempt these transactions concurrently, the UPDATE statements will still be fully sequenced, and the latter one will fail. Part of https://github.com/letsencrypt/boulder/issues/8031	2025-03-03 15:29:57 -08:00
Aaron Gable	d9433fe293	Remove 'RETURNING' functionality from MultiInserter (#7740 ) Deprecate the "InsertAuthzsIndividually" feature flag, which has been set to true in both Staging and Production. Delete the code guarded behind that flag being false, namely the ability of the MultiInserter to return the newly-created IDs from all of the rows it has inserted. This behavior is being removed because it is not supported in MySQL / Vitess. Fixes https://github.com/letsencrypt/boulder/issues/7718 --- > [!WARNING] > ~~Do not merge until IN-10737 is complete~~	2025-02-19 14:37:22 -08:00
Aaron Gable	86ab2ed245	SA: Support profiles associated with authorizations (#7956 ) Add "certificateProfileName" to the model used to insert new authz2 rows and to the list of column names read when retrieving rows from the authz2 table. Add support for this column to the functions which convert to and from authz2 model types. Add support for the profile field to core types so that it can be returned by the SA. Fixes https://github.com/letsencrypt/boulder/issues/7955	2025-01-27 14:53:30 -08:00
Jacob Hoffman-Andrews	e0221b6bbe	crl-updater: query by explicit shard too (#7973 ) Add querying by explicit shard (SA.GetRevokedCertsByShard) in addition to querying by temporal shard (SA.GetRevokedCerts). Merge results from both kinds of shard. De-duplicate by serial within a shard, because the same certificate could wind up in a temporal shard that matches its explicit shard. When de-duplicating, validate that revocation reasons are the same or (very unlikely) represent a re-revocation based on demonstrating key compromise. This can happen because the two different SA queries occur at slightly different times. Add unit testing that CRL entries make it through the whole pipeline from SA, to CA, to uploader. Rename some types in the unittest to be more accessible. Tweak a comment in SA.UpdateRevokedCertificate to make it clear that status _and_ reason are critical for re-revocation. Note: This GetRevokedCertsByShard code path will always return zero certificates right now, because nothing is writing to the `revokedCertificates` table. Writing to that table is gated on certificates having CRL URLs in them, which is not yet implemented (and will be config-gated). Part of #7094	2025-01-27 10:11:09 -08:00
Jacob Hoffman-Andrews	a9080705b4	ra: revoke with explicit CRL shard (#7944 ) In RA.RevokedCertificate, if the certificate being revoked has a crlDistributionPoints extension, parse the URL and pass the appropriate shard to the SA. This required some changes to the `admin` tool. When a malformed certificate is revoked, we don't have a parsed copy of the certificate to extract a CRL URL from. So, specifically when a malformed certificate is being revoked, allow specifying a CRL shard. Because different certificates will have different shards, require one-at-a-time revocation for malformed certificates. To support that refactoring, move the serial-cleaning functionality earlier in the `admin` tool's flow. Also, split out one of the cases handled by the `revokeCertificate` helper in the RA. For admin malformed revocations, we need to accept a human-specified ShardIdx, so call the SA directly in that case (and skip stat increment since admin revocations aren't useful for metrics). This allows `revokeCertificate` to be a more helpful helper, by extracting serial, issuer ID, and CRL shard automatically from an `*x509.Certificate`. Note: we don't yet issue certificates with the crlDistributionPoints extension, so this code will not be active until we start doing so. Part of #7094.	2025-01-21 21:31:40 -08:00
Aaron Gable	6b1e7f04e8	SA: Clean up pre-profile order schema and feature flag (#7953 ) Deprecate the MultipleCertificateProfiles feature flag, which has been enabled in both Staging and Prod. Delete all code protected by that flag being false, namely the orderModelv1 type and its support code. Update the config schema to match the config-next schema. Fixes https://github.com/letsencrypt/boulder/issues/7324 Fixes https://github.com/letsencrypt/boulder/issues/7408	2025-01-17 17:15:01 -08:00
James Renken	2e1f733c26	ra/sa: Remove deprecated UpdateRegistration methods (#7911 ) This is the final stage of #5554: removing the old, combined `UpdateRegistration` flow, which has been replaced by `UpdateRegistrationContact` and `UpdateRegistrationKey`. Those new functions have their own tests. The RA's `UpdateRegistration` function no longer has any callers (as of #7827's deployment), so it is safely deployable to remove it from the SA too, and its request from gRPC. Fixes #5554 --------- Co-authored-by: Jacob Hoffman-Andrews <jsha+github@letsencrypt.org> Co-authored-by: Aaron Gable <aaron@letsencrypt.org>	2025-01-14 13:54:06 -08:00
James Renken	e4668b4ca7	Deprecate DisableLegacyLimitWrites & UseKvLimitsForNewOrder flags; remove code using certificatesPerName & newOrdersRL tables (#7858 ) Remove code using `certificatesPerName` & `newOrdersRL` tables. Deprecate `DisableLegacyLimitWrites` & `UseKvLimitsForNewOrder` flags. Remove legacy `ratelimit` package. Delete these RA test cases: - `TestAuthzFailedRateLimitingNewOrder` (rl: `FailedAuthorizationsPerDomainPerAccount`) - `TestCheckCertificatesPerNameLimit` (rl: `CertificatesPerDomain`) - `TestCheckExactCertificateLimit` (rl: `CertificatesPerFQDNSet`) - `TestExactPublicSuffixCertLimit` (rl: `CertificatesPerDomain`) Rate limits in NewOrder are now enforced by the WFE, starting here: `5a9b4c4b18/wfe2/wfe.go (L781)` We collect a batch of transactions to check limits, check them all at once, go through and find which one(s) failed, and serve the failure with the Retry-After that's furthest in the future. All this code doesn't really need to be tested again; what needs to be tested is that we're returning the correct failure. That code is `NewOrderLimitTransactions`, and the `ratelimits` package's tests cover this. The public suffix handling behavior is tested by `TestFQDNsToETLDsPlusOne`: `5a9b4c4b18/ratelimits/utilities_test.go (L9)` Some other RA rate limit tests were deleted earlier, in #7869. Part of #7671.	2025-01-10 12:50:57 -08:00
Jacob Hoffman-Andrews	635f43266a	use core.IsAnyNilOrZero more places (#7925 ) There were a bunch of places that had `TODO(#7153)`; that issue is now closed, so let's tidy up.	2025-01-07 15:48:47 -08:00
Aaron Gable	442d152b72	Fix orderModelv2 for nullable profile column (#7907 ) Change the type of the orderModelv2 CertificateProfileName field to be a pointer to a string, reflecting the fact that the underlying database column is nullable. Add tests to ensure that order rows inserted with either order model can be read using the other model. Fixes https://github.com/letsencrypt/boulder/issues/7906	2025-01-06 13:26:11 -08:00
Aaron Gable	0c658f202a	Fix error when deactivating an account (#7899 ) The RA's DeactivateAccount method expects the account provided to it by the WFE to still have status Valid. The new WFE deactivation code was hardcoding the status to Deactivated. Fix the WFE to pass the account's current status instead. Add an integration test to confirm both the breakage and the fix. Also leave behind some TODOs to simplify this codepath further, and not require the status to be provided at all. Part of #5554	2024-12-18 10:06:08 -08:00
Samantha Frank	a8cdaf8989	ratelimit: Remove legacy registrations per IP implementation (#7760 ) Part of #7671	2024-11-19 18:39:21 -05:00
James Renken	6a2819a95a	Introduce separate UpdateRegistrationContact & UpdateRegistrationKey methods in RA & SA (#7735 ) Introduce separate UpdateRegistrationContact & UpdateRegistrationKey methods in RA & SA Clear contact field during DeactivateRegistration Part of #7716 Part of #5554	2024-11-06 10:07:31 -08:00
Jacob Hoffman-Andrews	e182d889b2	sa: document the storage of linting certificates (#7772 ) The naming of our `precertificates` table (now used to store linting certificates) is definitely confusing, so add some more comments in various places explaining. See #6807.	2024-10-28 10:23:39 -07:00
Samantha Frank	6c85b8d019	wfe/sa/features: Deprecate TrackReplacementCertificatesARI (#7766 )	2024-10-24 13:38:33 -04:00
James Renken	b0bcbb12aa	SA: Create list of authzIDs earlier in NewOrderAndAuthzs (#7744 ) Creating the list of authzIDs earlier in NewOrderAndAuthzs: - Saves a `for` loop with duplicated code; we no longer need to range over two different slices, just one. - Allows us to create the Order PB later, after more of the data collection logic, without interrupting it. This makes the order of operations slightly easier to follow.	2024-10-10 09:55:02 -07:00
Aaron Gable	7b032a663f	Add feature flag to remove use of "INSERT RETURNING" in NewOrderAndAuthzs (#7739 ) This is our only use of MariaDB's "INSERT ... RETURNING" syntax, which does not exist in MySQL and Vitess. Add a feature flag which removes our use of this feature, so that we can easily disable it and then re-enable it if it turns out to be too much of a performance hit. Also add a benchmark showing that the serial-insertion approach is slower, but perhaps not debilitatingly so. Part of https://github.com/letsencrypt/boulder/issues/7718	2024-10-04 14:56:44 -07:00
Samantha Frank	2fa9fbcd23	SA: Add feature flag DisableLegacyLimitWrites (#7728 )	2024-09-30 14:09:40 -04:00
Aaron Gable	e1790a5a02	Remove deprecated sapb.NewAuthzRequest fields (#7651 ) Remove the id, identifierValue, status, and challenges fields from sapb.NewAuthzRequest. These fields were left behind from the previous corepb.Authorization request type, and are now being ignored by the SA. Since the RA is no longer constructing full challenge objects to include in the request, remove pa.ChallengesFor and replace it with the much simpler pa.ChallengeTypesFor. Part of https://github.com/letsencrypt/boulder/issues/5913	2024-08-15 15:35:10 -07:00
Aaron Gable	46859a22d9	Use consistent naming for dnsName gRPC fields (#7654 ) Find all gRPC fields which represent DNS Names -- sometimes called "identifier", "hostname", "domain", "identifierValue", or other things -- and unify their naming. This naming makes it very clear that these values are strings which may be included in the SAN extension of a certificate with type dnsName. As we move towards issuing IP Address certificates, all of these fields will need to be replaced by fields which carry both an identifier type and value, not just a single name. This unified naming makes it very clear which messages and methods need to be updated to support non-dnsName identifiers. Part of https://github.com/letsencrypt/boulder/issues/7647	2024-08-12 14:32:55 -07:00
Aaron Gable	35b0b55453	Improve how we create new authorizations (#7643 ) Within the NewOrderAndAuthzsRequest, replace the corepb.Authorization field with a new sapb.NewAuthzRequest message. This message has all of the same field types and numbers, and the RA still populates all of these fields when constructing a request, for backwards compatibility. But it also has new fields (an Identifier carrying both type and value, a list of challenge types, and a challenge token) which the RA preferentially consumes if present. This causes the content of our NewOrderAndAuthzsRequest to more closely match the content that will be created at the database layer. Although this may seem like a step backwards in terms of abstraction, it is also a step forwards in terms of both efficiency (not having to transmit multiple nearly-identical challenge objects) and correctness (being guaranteed that the token is actually identical across all challenges). After this change is deployed, it will be followed by a change which removes the old fields from the NewAuthzRequest message, to realize the efficiency gains. Part of https://github.com/letsencrypt/boulder/issues/5913	2024-08-08 10:15:46 -07:00
Samantha Frank	c13591ab82	SFE: Call RA.UnpauseAccount and handle result (#7638 ) Call `RA.UnpauseAccount` for valid unpause form submissions. Determine and display the appropriate outcome to the Subscriber based on the count returned by `RA.UnpauseAccount`: - If the count is zero, display the "Account already unpaused" message. - If the count equals the max number of identifiers allowed in a single request, display a page explaining the need to visit the unpause URL again. - Otherwise, display the "Successfully unpaused all N identifiers" message. Apply per-request timeout from the SFE configuration. Part of https://github.com/letsencrypt/boulder/issues/7406	2024-07-31 14:46:46 -04:00
Samantha Frank	15ad9fc5ab	sa: Provide a grace period for recently unpaused identifiers (#7573 ) SA method PauseIdentifiers skips identifiers unpaused within the last 2 weeks, providing a grace period for operators to fix configuration issues resulting in numerous contiguous validation failures. Part of #7475	2024-07-11 12:11:27 -04:00
Samantha Frank	63452d5afe	sa: Avoid database timeouts in UnpauseAccount (#7572 ) SA method UnpauseAccount uses up to 5 `UPDATE` query iterations, each with a `LIMIT` of 10000, to unpause up to 50000 identifiers and returns a count of identifiers unpaused. Part of #7475	2024-07-10 10:41:51 -04:00
Jacob Hoffman-Andrews	3baea4356f	Revert "sa: truncate all timestamps to seconds (#7519 )" (#7559 ) This reverts commit `2b5b6239a4`. Following up on #7556, after we made a more systematic change to use borp's TypeConverter, we no longer need to manually truncate timestamps.	2024-06-26 17:25:05 -07:00
Samantha Frank	de9c06129b	SA: Last round of comments for PR #7490 (#7551 ) Fixes #7550 Part of #7406 Part of #7475	2024-06-20 12:56:39 -04:00
Samantha	594cb1332f	SA: Implement schema and methods for (account, hostname) pausing (#7490 ) Add the storage implementation for our new (account, hostname) pair pausing feature. - Add schema and model for the new paused table - Add SA service methods for interacting with the paused table Part of #7406 Part of #7475	2024-06-17 10:18:10 -04:00
Jacob Hoffman-Andrews	2b5b6239a4	sa: truncate all timestamps to seconds (#7519 ) As described in #7075, go-sql-driver/mysql v1.5.0 truncates timestamps to microseconds, while v1.6.0 and above does not. That means upon upgrading to v1.6.0, timestamps are written to the database with a resolution of nanoseconds, and SELECT statements also use a resolution of nanoseconds. We believe this is the cause of performance problems we observed when upgrading to v1.6.0 and above. To fix that, apply rounding in the application code. Rather than just rounding to microseconds, round to seconds since that is the resolution we care about. Using seconds rather than microseconds may also allow some of our indexes to grow more slowly over time. Note: this omits truncating some timestamps in CRL shard calculations, since truncating those resulted in test failures that I'll follow up on separately.	2024-06-12 15:00:24 -07:00
Aaron Gable	b92581d620	Better compile-time type checking for gRPC server implementations (#7504 ) Replaced our embeds of foopb.UnimplementedFooServer with foopb.UnsafeFooServer. Per the grpc-go docs this reduces the "forwards compatibility" of our implementations, but that is only a concern for codebases that are implementing gRPC interfaces maintained by third parties, and which want to be able to update those third-party dependencies without updating their own implementations in lockstep. Because we update our protos and our implementations simultaneously, we can remove this safety net to replace runtime type checking with compile-time type checking. However, that replacement is not enough, because we never pass our implementation objects to a function which asserts that they match a specific interface. So this PR also replaces our reflect-based unittests with idiomatic interface assertions. I do not view this as a perfect solution, as it relies on people implementing new gRPC servers to add this line, but it is no worse than the status quo which relied on people adding the "TestImplementation" test. Fixes https://github.com/letsencrypt/boulder/issues/7497	2024-05-28 09:26:29 -07:00
Aaron Gable	89213f9214	Use generic types for gRPC stream implementations (#7501 ) Update the version of protoc-gen-go-grpc that we use to generate Go gRPC code from our proto files, and update the versions of other gRPC tools and libraries that we use to match. Turn on the new `use_generic_streams` code generation flag to change how protoc-gen-go-grpc generates implementations of our streaming methods, from creating a wholly independent implementation for every stream to using shared generic implementations. Take advantage of this code-sharing to remove our SA "wrapper" methods, now that they have truly the same signature as the SARO methods which they wrap. Also remove all references to the old-style stream names (e.g. foopb.FooService_BarMethodClient) and replace them with the new underlying generic names, for the sake of consistency. Finally, also remove a few custom stream test mocks, replacing them with the generic mocks.ServerStreamClient. Note that this PR does not change the names in //mocks/sa.go, to avoid conflicts with work happening in the pursuit of https://github.com/letsencrypt/boulder/issues/7476. Note also that this PR updates the version of protoc-gen-go-grpc that we use to a specific commit. This is because, although a new release of grpc-go itself has been cut, the codegen binary is a separate Go module with its own releases, and it hasn't had a new release cut yet. Tracking for that is in https://github.com/grpc/grpc-go/issues/7030.	2024-05-24 13:54:25 -07:00
Jacob Hoffman-Andrews	e75a821cc9	sa: eliminate requestedNames table (#7471 ) Part of https://github.com/letsencrypt/boulder/issues/7432 Follows up on https://github.com/letsencrypt/boulder/pull/7435, now that that PR is deployed.	2024-05-06 09:15:51 -07:00
Jacob Hoffman-Andrews	89b07f4543	sa: get order names from authorizations (#7435 ) This removes the only place we query the requestedNames table, which allows us to get rid of it in a subsequent PR (once this one is merged and deployed). Part of https://github.com/letsencrypt/boulder/issues/7432	2024-04-18 14:00:53 -07:00
Phil Porada	8556eaedca	SA: store and return certificate profile name (#7352 ) Adds `certificateProfileName` to the `orders` database table. The [maximum length](https://github.com/letsencrypt/boulder/pull/7325/files#diff-a64a0af7cbf484da8e6d08d3eefdeef9314c5d9888233f0adcecd21b800102acR35) of a profile name matches the `//issuance` package. Adds a `MultipleCertificateProfiles` feature flag that, when enabled, will store the certificate profile name from a `NewOrderRequest`. The certificate profile name is allowed to be empty and the database will treat that row as [NULL](https://mariadb.com/kb/en/null-values/). When the SA retrieves this potentially NULL row, it will be cast as the golang string zero value `""`. SRE ticket IN-10145 has been filed to perform the database migration and enable the new feature flag. The migration must be performed before enabling the feature flag. Part of https://github.com/letsencrypt/boulder/issues/7324	2024-03-20 13:08:31 -04:00
Samantha	5e68cbe552	WFE: Gate ARI limit exemption and replacement tracking on a feature flag (#7383 ) Gate checking of replacement orders and exemption for ARI replacements on the `TrackReplacementCertificatesARI` feature flag.	2024-03-18 12:22:01 -04:00
Samantha	f10abd27eb	SA/ARI: Add method of tracking certificate replacement (#7284 ) Part of #6732 Part of #7038	2024-02-08 14:19:29 -05:00
Aaron Gable	164e035915	Reduce logging from inflight validation collisions (#7209 ) If a client attempts to validate a challenge twice in rapid succession, we'll kick off two background validation routines. One of these will complete first, updating the database with success or failure. The other will fail when it attempts to update the database and finds that there are no longer any authorizations with that ID in the "pending" state. Reduce the level at which we log such events, since we don't particularly care about them. Fixes https://github.com/letsencrypt/boulder/issues/3995	2023-12-15 09:58:34 -08:00
Phil Porada	6925fad324	Finish migration from int64 timestamps to timestamppb (#7142 ) This is a cleanup PR finishing the migration from int64 timestamps to protobuf `timestamppb.Timestamps` by removing all usage of the old int64 fields. In the previous PR https://github.com/letsencrypt/boulder/pull/7121 all fields were switched to read from the protobuf timestamppb fields. Adds a new case to `core.IsAnyNilOrZero` to check various properties of a `timestamppb.Timestamp` reducing the visual complexity for receivers. Fixes https://github.com/letsencrypt/boulder/issues/7060	2023-11-27 13:37:31 -08:00
Phil Porada	a5c2772004	Add and populate new protobuf Timestamp fields (#7070 ) * Adds new `google.protobuf.Timestamp` fields to each .proto file where we had been using `int64` fields as a timestamp. * Updates relevant gRPC messages to populate the new `google.protobuf.Timestamp` fields in addition to the old `int64` timestamp fields. * Added tests for each `<x>ToPB` and `PBto<x>` functions to ensure that new fields passed into a gRPC message arrive as intended. * Removed an unused error return from `PBToCert` and `PBToCertStatus` and cleaned up each call site. Built on-top of https://github.com/letsencrypt/boulder/pull/7069 Part 2 of 4 related to https://github.com/letsencrypt/boulder/issues/7060	2023-10-11 12:12:12 -04:00
Aaron Gable	bab048d221	SA: Add and use revokedCertificates table (#7095 ) Add a new "revokedCertificates" table to the database schema. This table is similar to the existing "certificateStatus" table in many ways, but the idea is that it will only have rows added to it when certificates are revoked, not when they're issued. Thus, it will grow many orders of magnitude slower than the certificateStatus table does. Eventually, it will replace that table entirely. The one column that revokedCertificates adds is the new "ShardIdx" column, which is the CRL shard in which the revoked certificate will appear. This way we can assign certificates to CRL shards at the time they are revoked, and guarantee that they will never move to a different shard even if we change the number of shards we produce. This will eventually allow us to put CRL URLs directly into our certificates, replacing OCSP URLs. Add new logic to the SA's RevokeCertificate and UpdateRevokedCertificate methods to handle this new table. If these methods receive a request which specifies a CRL shard (our CRL shards are 1-indexed, so shard 0 does not exist), then they will ensure that the new revocation status is written into both the certificateStatus and revokedCertificates tables. This logic will not function until the RA is updated to take advantage of it, so it is not a risk for it to appear in Boulder before the new table has been created. Also add new logic to the SA's GetRevokedCertificates method. Similar to the above, this reads from the new table if the ShardIdx field is supplied in the request message. This code will not operate until the crl-updater is updated to include this field. We will not perform this update for a minimum of 100 days after this code is deployed, to ensure that all unexpired revoked certificates are present in the revokedCertificates table. Part of https://github.com/letsencrypt/boulder/issues/7094	2023-10-02 10:21:14 -07:00
Phil Porada	034316ef6a	Rename int64 timestamp related protobuf fields to <fieldname>NS (#7069 ) Rename all of int64 timestamp fields to `<fieldname>NS` to indicate they are Unix nanosecond timestamps. Part 1 of 4 related to https://github.com/letsencrypt/boulder/issues/7060	2023-09-15 13:49:07 -04:00
Aaron Gable	7bed24a401	SA: Fix two bugs in UpdateCRLShard (#7052 ) The NextUpdate field should not be required, as it is not necessary for tracking and preventing duplicate work between multiple crl-updater instances. The ThisUpdate conditional needs explicit handling for NULL to ensure that it updates correctly.	2023-08-31 12:06:33 -04:00
Aaron Gable	6a450a2272	Improve CRL shard leasing (#7030 ) Simplify the index-picking logic in the SA's leaseOldestCrlShard method. Specifically, more clearly separate it into "missing" and "non-missing" cases, which require entirely different logic: picking a random missing shard, or picking the oldest unleased shard, respectively. Also change the UpdateCRLShard method to "unlease" shards when they're updated. This allows the crl-updater to run as quickly as it likes, while still ensuring that multiple instances do not step on each other's toes. The config change for shardWidth and lookbackPeriod instead of certificateLifetime has been deployed in prod since IN-8445. The config change changing the shardWidth is just so that the tests neither produce a bazillion shards, nor have to do a bazillion SA queries for each chunk within a shard, improving the readability of test logs. Part of https://github.com/letsencrypt/boulder/issues/7023	2023-08-08 17:05:00 -07:00
Jacob Hoffman-Andrews	38fc840184	sa: refactor how metrics and logging are set up (#7031 ) This eliminates the need for a pair of accessors on `db.WrappedMap` that expose the underlying `sql.DB` and `borp.DbMap`. Fixes #6991	2023-08-08 09:51:23 -07:00
Aaron Gable	908421bb98	crl-updater: lease CRL shards to prevent races (#6941 ) Add a new feature flag, LeaseCRLShards, which controls certain aspects of crl-updater's behavior. When this flag is enabled, crl-updater calls the new SA.LeaseCRLShard method before beginning work on a shard. This prevents it from stepping on the toes of another crl-updater instance which may be working on the same shard. This is important to prevent two competing instances from accidentally updating a CRL's Number (which is an integer representation of its thisUpdate timestamp) backwards, which would be a compliance violation. When this flag is enabled, crl-updater also calls the new SA.UpdateCRLShard method after finishing work on a shard. In the future, additional work will be done to make crl-updater use the "give me the oldest available shard" mode of the LeaseCRLShard method. Fixes https://github.com/letsencrypt/boulder/issues/6897	2023-07-19 15:11:16 -07:00
Jacob Hoffman-Andrews	7d66d67054	It's borpin' time! (#6982 ) This change replaces [gorp] with [borp]. The changes consist of a mass renaming of the import and comments / doc fixups, plus modifications of many call sites to provide a context.Context everywhere, since borp newly requires this (this was one of the motivating factors for the borp fork). This also refactors `github.com/letsencrypt/boulder/db.WrappedMap` and `github.com/letsencrypt/boulder/db.Transaction` to not embed their underlying gorp/borp objects, but to have them as plain fields. This ensures that we can only call methods on them that are specifically implemented in `github.com/letsencrypt/boulder/db`, so we don't miss wrapping any. This required introducing a `NewWrappedMap` method along with accessors `SQLDb()` and `BorpDB()` to get at the internal fields during metrics and logging setup. Fixes #6944	2023-07-17 14:38:29 -07:00
Aaron Gable	3d80d8505e	SA: gRPC methods for leasing CRL shards (#6940 ) Add two new methods, LeaseCRLShard and UpdateCRLShard, to the SA gRPC interface. These methods work in concert both to prevent multiple instances of crl-updater from stepping on each other's toes, and to lay the groundwork for a less bursty version of crl-updater in the future. Introduce a new database table, crlShards, which tracks the thisUpdate and nextUpdate timestamps of each CRL shard for each issuer. It also has a column "leasedUntil", which is also a timestamp. Grant the SA user read-write access to this table. LeaseCRLShard updates the leasedUntil column of the identified shard to the given time. It returns an error if the identified shard's leasedUntil timestamp is already in the future. This provides a mechanism for crl-updater instances to "lick the cookie", so to speak, marking CRL shards as "taken" so that multiple crl-updater instances don't attempt to work on the same shard at the same time. Using a timestamp has the added benefit that leases are guaranteed to expire, ensuring that we don't accidentally fail to work on a shard forever. LeaseCRLShard has a second mode of operation, when a range of potential shards is given in the request, rather than a single shard. In this mode, it returns the shard (within the given range) whose thisUpdate timestamp is oldest. (Shards with no thisUpdate timestamp, including because the requested range includes shard indices the database doesn't yet know about, count as older than any shard with any thisUpdate timestamp.) This allows crl-updater instances which don't care which shard they're working on to do the most urgent work first. UpdateCRLShard updates the thisUpdate and nextUpdate timestamps of the identified shard. This closes the loop with the second mode of LeaseCRLShard above: by updating the thisUpdate timestamp, the method marks the shard as no longer urgently needing to be worked on. IN-9220 tracks creating this table in staging and production Part of #6897	2023-06-26 15:39:13 -07:00

295 Commits