boulder

Commit Graph

Author	SHA1	Message	Date
Aaron Gable	164e035915	Reduce logging from inflight validation collisions (#7209 ) If a client attempts to validate a challenge twice in rapid succession, we'll kick off two background validation routines. One of these will complete first, updating the database with success or failure. The other will fail when it attempts to update the database and finds that there are no longer any authorizations with that ID in the "pending" state. Reduce the level at which we log such events, since we don't particularly care about them. Fixes https://github.com/letsencrypt/boulder/issues/3995	2023-12-15 09:58:34 -08:00
Aaron Gable	21b18667b2	Remove static test certs from SA unittests (#7217 ) Fixes https://github.com/letsencrypt/boulder/issues/6279	2023-12-15 07:36:59 -08:00
Phil Porada	51e9f39259	Finish migration from int64 durations to durationpb (#7147 ) This is a cleanup PR finishing the migration from int64 durations to protobuf `*durationpb.Duration` by removing all usage of the old int64 fields. In the previous PR https://github.com/letsencrypt/boulder/pull/7146 all fields were switched to read from the protobuf durationpb fields. Fixes https://github.com/letsencrypt/boulder/issues/7097	2023-11-28 12:51:11 -05:00
Phil Porada	6925fad324	Finish migration from int64 timestamps to timestamppb (#7142 ) This is a cleanup PR finishing the migration from int64 timestamps to protobuf `timestamppb.Timestamps` by removing all usage of the old int64 fields. In the previous PR https://github.com/letsencrypt/boulder/pull/7121 all fields were switched to read from the protobuf timestamppb fields. Adds a new case to `core.IsAnyNilOrZero` to check various properties of a `timestamppb.Timestamp` reducing the visual complexity for receivers. Fixes https://github.com/letsencrypt/boulder/issues/7060	2023-11-27 13:37:31 -08:00
Phil Porada	279a4d539d	Read from durationpb instead of int64 durations (#7146 ) Switch to reading grpc duration values from the new durationpb protobuf fields, completely ignoring the old int64 fields. Part 2 of 3 for https://github.com/letsencrypt/boulder/issues/7097	2023-11-13 12:23:46 -05:00
Aaron Gable	f24ec910ef	Further simplifications to test.ThrowAwayCert (#7129 ) Remove ThrowAwayCert's nameCount argument, since it is always set to 1 by all callers. Remove ThrowAwayCertWithSerial, because it has no callers. Change the throwaway cert's key from RSA512 to ECDSA P-224 for a two-orders-of-magnitude speedup in key generation. Use this simplified form in two new places in the RA that were previously rolling their own test certs.	2023-11-02 09:45:56 -07:00
Aaron Gable	3a3e32514c	Give throwaway test certs reasonable validity intervals (#7128 ) Add a new clock argument to the test-only ThrowAwayCert function, and use that clock to generate reasonable notBefore and notAfter timestamps in the resulting throwaway test cert. This is necessary to easily test functions which rely on the expiration timestamp of the certificate, such as upcoming work about computing CRL shards. Part of https://github.com/letsencrypt/boulder/issues/7094	2023-11-01 15:24:43 -07:00
Phil Porada	b8b105453a	Rename protobuf duration fields to <fieldname>NS and populate new duration fields (#7115 ) * Renames all int64 fields used as a time.Duration to `<fieldname>NS` to indicate they are Unix nanoseconds. * Adds new `google.protobuf.Duration` fields to each .proto file where we previously had been using an int64 field to populate a time.Duration. * Updates relevant gRPC messages. Part 1 of 3 for https://github.com/letsencrypt/boulder/issues/7097	2023-10-26 10:46:03 -04:00
Phil Porada	a5c2772004	Add and populate new protobuf Timestamp fields (#7070 ) * Adds new `google.protobuf.Timestamp` fields to each .proto file where we had been using `int64` fields as a timestamp. * Updates relevant gRPC messages to populate the new `google.protobuf.Timestamp` fields in addition to the old `int64` timestamp fields. * Added tests for each `<x>ToPB` and `PBto<x>` function to ensure that new fields passed into a gRPC message arrive as intended. * Removed an unused error return from `PBToCert` and `PBToCertStatus` and cleaned up each call site. Built on top of https://github.com/letsencrypt/boulder/pull/7069 Part 2 of 4 related to https://github.com/letsencrypt/boulder/issues/7060	2023-10-11 12:12:12 -04:00
Aaron Gable	bab048d221	SA: Add and use revokedCertificates table (#7095 ) Add a new "revokedCertificates" table to the database schema. This table is similar to the existing "certificateStatus" table in many ways, but the idea is that it will only have rows added to it when certificates are revoked, not when they're issued. Thus, it will grow many orders of magnitude slower than the certificateStatus table does. Eventually, it will replace that table entirely. The one column that revokedCertificates adds is the new "ShardIdx" column, which is the CRL shard in which the revoked certificate will appear. This way we can assign certificates to CRL shards at the time they are revoked, and guarantee that they will never move to a different shard even if we change the number of shards we produce. This will eventually allow us to put CRL URLs directly into our certificates, replacing OCSP URLs. Add new logic to the SA's RevokeCertificate and UpdateRevokedCertificate methods to handle this new table. If these methods receive a request which specifies a CRL shard (our CRL shards are 1-indexed, so shard 0 does not exist), then they will ensure that the new revocation status is written into both the certificateStatus and revokedCertificates tables. This logic will not function until the RA is updated to take advantage of it, so it is not a risk for it to appear in Boulder before the new table has been created. Also add new logic to the SA's GetRevokedCertificates method. Similar to the above, this reads from the new table if the ShardIdx field is supplied in the request message. This code will not operate until the crl-updater is updated to include this field. We will not perform this update for a minimum of 100 days after this code is deployed, to ensure that all unexpired revoked certificates are present in the revokedCertificates table. Part of https://github.com/letsencrypt/boulder/issues/7094	2023-10-02 10:21:14 -07:00
Phil Porada	034316ef6a	Rename int64 timestamp related protobuf fields to <fieldname>NS (#7069 ) Rename all int64 timestamp fields to `<fieldname>NS` to indicate they are Unix nanosecond timestamps. Part 1 of 4 related to https://github.com/letsencrypt/boulder/issues/7060	2023-09-15 13:49:07 -04:00
Aaron Gable	a70fc604a3	Use go1.21's stdlib slices package (#7074 ) As of go1.21, there's a new standard library package which provides basically the same (generic!) methods as the x/exp/slices package does. Now that we're on go1.21, let's use the more stable package. Fixes https://github.com/letsencrypt/boulder/issues/6951 Fixes https://github.com/letsencrypt/boulder/issues/7032	2023-09-08 13:46:46 -07:00
Aaron Gable	7bed24a401	SA: Fix two bugs in UpdateCRLShard (#7052 ) The NextUpdate field should not be required, as it is not necessary for tracking and preventing duplicate work between multiple crl-updater instances. The ThisUpdate conditional needs explicit handling for NULL to ensure that it updates correctly.	2023-08-31 12:06:33 -04:00
Aaron Gable	6a450a2272	Improve CRL shard leasing (#7030 ) Simplify the index-picking logic in the SA's leaseOldestCrlShard method. Specifically, more clearly separate it into "missing" and "non-missing" cases, which require entirely different logic: picking a random missing shard, or picking the oldest unleased shard, respectively. Also change the UpdateCRLShard method to "unlease" shards when they're updated. This allows the crl-updater to run as quickly as it likes, while still ensuring that multiple instances do not step on each other's toes. The config change for shardWidth and lookbackPeriod instead of certificateLifetime has been deployed in prod since IN-8445. The config change changing the shardWidth is just so that the tests neither produce a bazillion shards, nor have to do a bazillion SA queries for each chunk within a shard, improving the readability of test logs. Part of https://github.com/letsencrypt/boulder/issues/7023	2023-08-08 17:05:00 -07:00
Jacob Hoffman-Andrews	38fc840184	sa: refactor how metrics and logging are set up (#7031 ) This eliminates the need for a pair of accessors on `db.WrappedMap` that expose the underlying `sql.DB` and `borp.DbMap`. Fixes #6991	2023-08-08 09:51:23 -07:00
Aaron Gable	9a4f0ca678	Deprecate LeaseCRLShards feature (#7009 ) This feature flag is enabled in both staging and prod.	2023-08-07 15:17:00 -07:00
Jacob Hoffman-Andrews	725f190c01	ca: remove orphan queue code (#7025 ) The `orphanQueueDir` config field is no longer used anywhere. Fixes #6551	2023-08-02 16:04:28 -07:00
Samantha	055f620c4b	Initial implementation of key-value rate limits (#6947 ) This design seeks to reduce read-pressure on our DB by moving rate limit tabulation to a key-value datastore. This PR provides the following: - (README.md) a short guide to the schemas, formats, and concepts introduced in this PR - (source.go) an interface for storing, retrieving, and resetting a subscriber bucket - (name.go) an enumeration of all defined rate limits - (limit.go) a schema for defining default limits and per-subscriber overrides - (limiter.go) a high-level API for interacting with key-value rate limits - (gcra.go) an implementation of the Generic Cell Rate Algorithm, a leaky bucket-style scheduling algorithm, used to calculate the present or future capacity of a subscriber bucket using spend and refund operations Note: the included source implementation is test-only and currently accomplished using a simple in-memory map protected by a mutex, implementations using Redis and potentially other data stores will follow. Part of #5545	2023-07-21 12:57:18 -04:00
Aaron Gable	908421bb98	crl-updater: lease CRL shards to prevent races (#6941 ) Add a new feature flag, LeaseCRLShards, which controls certain aspects of crl-updater's behavior. When this flag is enabled, crl-updater calls the new SA.LeaseCRLShard method before beginning work on a shard. This prevents it from stepping on the toes of another crl-updater instance which may be working on the same shard. This is important to prevent two competing instances from accidentally updating a CRL's Number (which is an integer representation of its thisUpdate timestamp) backwards, which would be a compliance violation. When this flag is enabled, crl-updater also calls the new SA.UpdateCRLShard method after finishing work on a shard. In the future, additional work will be done to make crl-updater use the "give me the oldest available shard" mode of the LeaseCRLShard method. Fixes https://github.com/letsencrypt/boulder/issues/6897	2023-07-19 15:11:16 -07:00
Jacob Hoffman-Andrews	7d66d67054	It's borpin' time! (#6982 ) This change replaces [gorp] with [borp]. The changes consist of a mass renaming of the import and comments / doc fixups, plus modifications of many call sites to provide a context.Context everywhere, since borp newly requires this (this was one of the motivating factors for the borp fork). This also refactors `github.com/letsencrypt/boulder/db.WrappedMap` and `github.com/letsencrypt/boulder/db.Transaction` to not embed their underlying gorp/borp objects, but to have them as plain fields. This ensures that we can only call methods on them that are specifically implemented in `github.com/letsencrypt/boulder/db`, so we don't miss wrapping any. This required introducing a `NewWrappedMap` method along with accessors `SQLDb()` and `BorpDB()` to get at the internal fields during metrics and logging setup. Fixes #6944	2023-07-17 14:38:29 -07:00
Aaron Gable	bd29cc430f	Allow reading incident rows with NULL columns (#6961 ) Fixes https://github.com/letsencrypt/boulder/issues/6960	2023-06-30 08:29:16 -07:00
Aaron Gable	3d80d8505e	SA: gRPC methods for leasing CRL shards (#6940 ) Add two new methods, LeaseCRLShard and UpdateCRLShard, to the SA gRPC interface. These methods work in concert both to prevent multiple instances of crl-updater from stepping on each other's toes, and to lay the groundwork for a less bursty version of crl-updater in the future. Introduce a new database table, crlShards, which tracks the thisUpdate and nextUpdate timestamps of each CRL shard for each issuer. It also has a column "leasedUntil", which is also a timestamp. Grant the SA user read-write access to this table. LeaseCRLShard updates the leasedUntil column of the identified shard to the given time. It returns an error if the identified shard's leasedUntil timestamp is already in the future. This provides a mechanism for crl-updater instances to "lick the cookie", so to speak, marking CRL shards as "taken" so that multiple crl-updater instances don't attempt to work on the same shard at the same time. Using a timestamp has the added benefit that leases are guaranteed to expire, ensuring that we don't accidentally fail to work on a shard forever. LeaseCRLShard has a second mode of operation, when a range of potential shards is given in the request, rather than a single shard. In this mode, it returns the shard (within the given range) whose thisUpdate timestamp is oldest. (Shards with no thisUpdate timestamp, including because the requested range includes shard indices the database doesn't yet know about, count as older than any shard with any thisUpdate timestamp.) This allows crl-updater instances which don't care which shard they're working on to do the most urgent work first. UpdateCRLShard updates the thisUpdate and nextUpdate timestamps of the identified shard. This closes the loop with the second mode of LeaseCRLShard above: by updating the thisUpdate timestamp, the method marks the shard as no longer urgently needing to be worked on. IN-9220 tracks creating this table in staging and production Part of #6897	2023-06-26 15:39:13 -07:00
Jacob Hoffman-Andrews	824417f6c0	sa: refactor db initialization (#6930 ) Previously, we had three chained calls initializing a database: - InitWrappedDb calls NewDbMap - NewDbMap calls NewDbMapFromConfig Since all three are exported, this left me wondering when to call one vs the others. It turns out that NewDbMap is only called from tests, so I renamed it to DBMapForTest to make that clear. NewDbMapFromConfig is only called internally to the SA, so I unexported it as newDbMapFromMysqlConfig. Also, I copied the ParseDSN call into InitWrappedDb, so it doesn't need to call DBMapForTest. Now InitWrappedDb and DBMapForTest both independently call newDbMapFromMysqlConfig. I also noticed that InitDBMetrics was only called internally so I unexported it.	2023-06-13 10:15:40 -07:00
Samantha	124c4cc6f5	grpc/sa: Implement deep health checks (#6928 ) Add the necessary scaffolding for deep health checking of our various gRPC components. Each component implementation that also implements the grpc.checker interface will be checked periodically, and the health status of the component will be updated accordingly. Add the necessary methods to SA to implement the grpc.checker interface and register these new health checks with Consul. Additionally: - Update entry point script to check for ProxySQL readiness. - Increase the poll rate for gRPC Consul checks from 5s to 2s to help with DNS failures, due to check failures, on startup. - Change log level for Consul from INFO to ERROR to deal with noisy logs full of transport failures due to Consul gRPC checks firing before the SAs are up. Fixes #6878 Part of #6795	2023-06-12 13:58:53 -04:00
Jacob Hoffman-Andrews	80e1510819	admin: add clear-email subcommand (#6919 ) When a user wants their email address deleted from the database but no longer has access to their account, this allows an administrator to clear it. This adds `admin` as an alias for `admin-revoker`, because we'd like the clear-email sub-command to be a part of that overall tool, but it's not really revocation related. Part of #6864	2023-06-01 14:33:24 -04:00
Samantha	e72a8f9cac	docker: Update proxysql container to match production (#6914 )	2023-05-31 11:31:10 -04:00
Jacob Hoffman-Andrews	b9eeb6ce1c	sa/database: move unmoored comment (#6922 ) This comment about STRICT_ALL_TABLES got separated from the code it documented. Bring them back together.	2023-05-30 09:15:06 -07:00
Phil Porada	c75bf7033a	SA: Don't store HTTP-01 hostname and port in database validationrecord (#6863 ) Removes the `Hostname` and `Port` fields from an http-01 ValidationRecord model prior to storing the record in the database. Using `"hostname":"example.com","port":"80"` as a snippet of a whole validation record, we'll save minimum 36 bytes for each new http-01 ValidationRecord that gets stored. When retrieving the record, the ValidationRecord `RehydrateHostPort` method will repopulate the `Hostname` and `Port` fields from the `URL` field. Fixes the main goal of https://github.com/letsencrypt/boulder/issues/5231. --------- Co-authored-by: Samantha <hello@entropy.cat>	2023-05-23 15:36:17 -04:00
Aaron Gable	56f8537e68	Ensure SelectOne queries never return more than 1 row (#6900 ) As a follow-up to https://github.com/letsencrypt/boulder/issues/5467, I did an audit of all places where we call SelectOne to ensure that those queries can never return more than one result. These four functions were the only places that weren't already constrained to a single result through the use of "SELECT COUNT", "LIMIT 1", "WHERE uniqueKey =", or similar. Limit these functions' queries to always only return a single result, now that their underlying tables no longer have unique key constraints. Additionally, slightly refactor selectRegistration to just take a single column name rather than a whole WHERE clause. Fixes https://github.com/letsencrypt/boulder/issues/6521	2023-05-17 14:13:21 -07:00
Matthew McPherrin	8c9c55609b	Remove redundant jose import alias (#6887 ) This PR should have no functional change; just a cleanup.	2023-05-15 09:45:58 -07:00
Aaron Gable	1fcd951622	Probs: simplifications and cleanup (#6876 ) Make minor, non-user-visible changes to how we structure the probs package. Notably: - Add new problem types for UnsupportedContact and UnsupportedIdentifier, which are specified by RFC8555 and which we will use in the future, but haven't been using historically. - Sort the problem types and constructor functions to match the (alphabetical) order given in RFC8555. - Rename some of the constructor functions to better match their underlying problem types (e.g. "TLSError" to just "TLS"). - Replace the redundant ProblemDetailsToStatusCode function with simply always returning a 500 if we haven't properly set the problem's HTTPStatus. - Remove the ability to use either the V1 or V2 error namespace prefix; always use the proper RFC namespace prefix.	2023-05-12 12:10:13 -04:00
Jacob Hoffman-Andrews	1c7e0fd1d8	Store linting certificate instead of precertificate (#6807 ) In order to get rid of the orphan queue, we want to make sure that before we sign a precertificate, we have enough data in the database that we can fulfill our revocation-checking obligations even if storing that precertificate in the database fails. That means: - We should have a row in the certificateStatus table for the serial. - But we should not serve "good" for that serial until we are positive the precertificate was issued (BRs 4.9.10). - We should have a record in the live DB of the proposed certificate's public key, so the bad-key-revoker can mark it revoked. - We should have a record in the live DB of the proposed certificate's names, so it can be revoked if we are required to revoke based on names. The SA.AddPrecertificate method already achieves these goals for precertificates by writing to the various metadata tables. This PR repurposes the SA.AddPrecertificate method to write "proposed precertificates" instead. We already create a linting certificate before the precertificate, and that linting certificate is identical to the precertificate that will be issued except for the private key used to sign it (and the AKID). So for instance it contains the right pubkey and SANs, and the Issuer name is the same as the Issuer name that will be used. So we'll use the linting certificate as the "proposed precertificate" and store it to the DB, along with appropriate metadata. In the new code path, rather than writing "good" for the new certificateStatus row, we write a new, fake OCSP status string "wait". This will cause us to return internalServerError to OCSP requests for that serial (but we won't get such requests because the serial has not yet been published). After we finish precertificate issuance, we update the status to "good" with SA.SetCertificateStatusReady. Part of #6665	2023-04-26 13:54:24 -07:00
Aaron Gable	97aa50977f	Give orderToAuthz2 an auto-increment ID column (#6835 ) Replace the current orderToAuthz2 table schema with one that includes an auto-increment ID column, so that this table can be partitioned simply by ID, like all of our other partitioned tables. Update the SA so that when it selects from a join over this table and the authz2 table, it explicitly selects the columns from the authz2 table, to avoid the ambiguity introduced by having two columns named "id" in the result set. This work is already in-progress in prod, represented by IN-8916 and IN-8928. Fixes https://github.com/letsencrypt/boulder/issues/6820	2023-04-24 14:59:18 -07:00
Aaron Gable	5480f1060b	Clean up database schema (#6832 ) Make a series of small changes to our test database schema, both to make it simpler to reason about and to bring it closer in alignment to our production database schema: - Incorporate the IssuedNamesDropIndex, Incidents, SimplePartitioning, and NotUnique migrations into the CombinedSchema, as they have been fully applied in prod; - Use CHARSET=utf8mb4 everywhere, instead of just utf8; - Use UNSIGNED for auto-increment ID columns in the tables where prod does; and - Re-sort the tables in CombinedSchema which no longer have foreign key constraints. Part of https://github.com/letsencrypt/boulder/issues/6820	2023-04-21 10:37:05 -07:00
Phil Porada	939a14544c	SA: Check MariaDB system variables at startup (#6791 ) Adds a new function to `//sa` that performs the following validations on a MariaDB config passed in via SA `setDefault` or via DSN: 1. Correct quoting for strings and string enums to prevent future problems such as PR #6683 from occurring. 2. Each system variable we care to use is scoped as SESSION, rather than strictly GLOBAL. 3. Detect system variables passed in that are not in a curated list of variables we care about. 4. Validate that values for booleans, floats, integers, and strings at least pass a basic regex. This change is in a bit of a weird place. The ideal place for this change would be `go-sql-driver/mysql`, but since that driver handles the general case of MySQL-compatible connections to the database, we're implementing this validation in Boulder instead. We're confident about the specific versions of MariaDB running in staging/prod and that the database vendor won't change underneath us, which is why I decided to take this approach. However, this change will bind us tighter to MariaDB than MySQL due to the specific variables we're checking. An up-to-date list of MariaDB system variables can be found [here.](https://mariadb.com/kb/en/server-system-variables/) Fixes https://github.com/letsencrypt/boulder/issues/6687.	2023-04-18 11:02:33 -04:00
Aaron Gable	1235cbed5e	Re-remove never-used crls table (#6817 ) Relands #5303, which was accidentally reverted in #5305. Fixes https://github.com/letsencrypt/boulder/issues/6816	2023-04-17 16:00:17 -07:00
Aaron Gable	45329c9472	Deprecate ROCSPStage7 flag (#6804 ) Deprecate the ROCSPStage7 feature flag, which caused the RA and CA to stop generating OCSP responses when issuing new certs and when revoking certs. (That functionality is now handled just-in-time by the ocsp-responder.) Delete the old OCSP-generating codepaths from the RA and CA. Remove the CA's internal reference to an OCSP implementation, because it no longer needs it. Additionally, remove the SA's "Issuers" config field, which was never used. Fixes #6285	2023-04-12 17:03:06 -07:00
Aaron Gable	7e994a1216	Deprecate ROCSPStage6 feature flag (#6770 ) Deprecate the ROCSPStage6 feature flag. Remove all references to the `ocspResponse` column from the SA, both when reading from and when writing to the `certificateStatus` table. This makes it safe to fully remove that column from the database. IN-8731 enabled this flag in all environments, so it is safe to deprecate. Part of #6285	2023-04-04 15:41:51 -07:00
Aaron Gable	8c67769be4	Remove ocsp-updater from Boulder (#6769 ) Delete the ocsp-updater service, and the //ocsp/updater library that supports it. Remove test configs for the service, and remove references to the service from other test files. This service has been fully shut down for an extended period now, and is safe to remove. Fixes #6499	2023-03-31 14:39:04 -07:00
Aaron Gable	9262ca6e3f	Add grpc implementation tests to all services (#6782 ) As a follow-up to #6780, add the same style of implementation test to all of our other gRPC services. This was not included in that PR just to keep it small and single-purpose.	2023-03-31 09:52:26 -07:00
Aaron Gable	27f0860aed	Remove precertificates.go (#6783 ) This file contained both read-only and read-write methods. Its existence is not reflected in any other gRPC or struct organization; it was easy to forget that it exists. Merge its contents into both sa.go and saro.go, so that the methods follow the same organization scheme as the rest of the SA. This makes it less likely that bugs like #6778 will happen again.	2023-03-30 17:59:11 -04:00
Aaron Gable	0d0116dd3f	Implement GetSerialMetadata on StorageAuthorityRO (#6780 ) When external clients make POST requests to our ARI endpoint, they're getting 404s even when a GET request with the same exact CertID succeeds. Logs show that this is because the SA is returning "method GetSerialMetadata not implemented" when the WFE attempts that gRPC request. This is due to an oversight: the GetSerialMetadata method is not implemented on the SQLStorageAuthorityRO object, only on the SQLStorageAuthority object. The unit tests did not catch this bug because they supply a mock SA, which does implement the method in question. Update the receiver and add a wrapper so that GetSerialMetadata is implemented on both the read-write and read-only SA implementation types. Add a new kind of test assertion which helps ensure this won't happen again. Add a TODO for an integration test covering the ARI POST codepath to prevent a regression. Fixes #6778	2023-03-30 12:32:14 -07:00
Phil Porada	ce2ee69c5f	SARO: Add sa_lag_factor metric to assess usage of the lagFactor codepath (#6774 ) Add `sa_lag_retry` prometheus countervec metric with pass/fail dimensions for `GetOrder`, `GetAuthorization2`, and `GetRegistration` methods. The new metrics will appear as follows: ``` sa_lag_retry{method="GetOrder",result="found"} 0 sa_lag_retry{method="GetOrder",result="notfound"} 0 sa_lag_retry{method="GetOrder",result="other"} 0 sa_lag_retry{method="GetAuthorization2",result="found"} 0 sa_lag_retry{method="GetAuthorization2",result="notfound"} 0 sa_lag_retry{method="GetAuthorization2",result="other"} 0 sa_lag_retry{method="GetRegistration",result="found"} 0 sa_lag_retry{method="GetRegistration",result="notfound"} 0 sa_lag_retry{method="GetRegistration",result="other"} 0 ``` Fixes https://github.com/letsencrypt/boulder/issues/6773 --------- Co-authored-by: Samantha <hello@entropy.cat>	2023-03-30 13:48:16 -04:00
Samantha	511f5b79f1	test: Add ProxySQL to our Docker development stack (#6754 ) Add an upstream ProxySQL container to our docker-compose. Configure ProxySQL to manage database connections for our unit and integration tests. Fixes #5873	2023-03-29 18:41:24 -04:00
Jacob Hoffman-Andrews	85fd3ed8b7	sa: remove GetPrecertificate (#6692 ) This was mostly unused. The only caller was orphan-finder, which used it to determine if a certificate was already in the database. But this is not particularly important functionality, so I've removed it.	2023-03-01 11:30:51 -08:00
Jacob Hoffman-Andrews	d9872dbe41	sa: rename AddPrecertificateRequest.IssuerID (#6689 ) sa: rename AddPrecertificateRequest.IssuerID to IssuerNameID. This is in preparation for adding a similarly-named field to AddSerialRequest. Part of #5152.	2023-02-27 17:21:00 -05:00
Aaron Gable	5ce4b5a6d4	Use time format constants (#6694 ) Use constants from the go stdlib time package, such as time.DateTime and time.RFC3339, when parsing and formatting timestamps. Additionally, simplify or remove some of our uses of parsing timestamps, such as to set fake clocks in tests.	2023-02-24 11:22:23 -08:00
Jacob Hoffman-Andrews	8fd5861c1f	sa: quote sql_mode (#6683 ) When sql_mode is set as part of a multi-variable SET command (which happens in go-sql-driver/mysql 1.6.0+), ProxySQL can mis-parse parts of the SET command that come after it. For instance, if we run: SET sql_mode=STRICT_ALL_TABLES,log_queries_not_using_indexes=ON; Then ProxySQL would mis-parse that and pass along to its upstream: SET sql_mode=STRICT_ALL_TABLES,log_queries_not_using_indexes; Adding quotes around sql_mode (a string-valued variable) causes ProxySQL to parse this correctly.	2023-02-22 16:30:04 -05:00
Aaron Gable	f9e4fb6c06	Add replication lag retries to some SA methods (#6649 ) Add a new time.Duration field, LagFactor, to both the SA's config struct and the read-only SA's implementation struct. In the GetRegistration, GetOrder, and GetAuthorization2 methods, if the database select returned a NoRows error and a lagFactor duration is configured, then sleep for the lagFactor duration and retry the select. This allows us to compensate for the replication lag between our primary write database and our read-only replica databases. Sometimes clients will fire requests in rapid succession (such as creating a new order, then immediately querying the authorizations associated with that order), and the subsequent requests will fail because they are directed to read replicas which are lagging behind the primary. Adding this simple sleep-and-retry will let us mitigate many of these failures, without adding too much complexity. Fixes #6593	2023-02-14 17:25:13 -08:00
Jacob Hoffman-Andrews	e57c788086	Add checking of validations to cert-checker (#6617 ) This includes two feature flags: one that controls turning on the extra database queries, and one that causes cert-checker to fail on missing validations. If the second flag isn't turned on, it will just emit error log lines. This will help us find any edge conditions we need to deal with before making the new code trigger alerts. Fixes #6562	2023-02-03 16:25:41 -05:00
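
The sleep-and-retry behaviour described in f9e4fb6c06 (#6649) can be sketched roughly as below. This is a minimal illustration and not Boulder's code: the names `errNoRows` and `selectWithLagRetry` are hypothetical, and the database select is simulated by a closure.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNoRows is a hypothetical stand-in for the "no rows" error a select
// returns when a read replica has not yet seen a freshly written row.
var errNoRows = errors.New("no rows in result set")

// selectWithLagRetry runs sel once. If it reports errNoRows and a non-zero
// lagFactor is configured, it sleeps for lagFactor and retries exactly once.
func selectWithLagRetry[T any](lagFactor time.Duration, sel func() (T, error)) (T, error) {
	res, err := sel()
	if errors.Is(err, errNoRows) && lagFactor > 0 {
		time.Sleep(lagFactor)
		return sel()
	}
	return res, err
}

func main() {
	calls := 0
	// Simulated lagging replica: the row only becomes visible on the second read.
	lookup := func() (string, error) {
		calls++
		if calls < 2 {
			return "", errNoRows
		}
		return "order-1", nil
	}
	got, err := selectWithLagRetry(10*time.Millisecond, lookup)
	fmt.Println(got, err, calls)
}
```

Retrying only once keeps the added latency bounded by a single lagFactor, which matches the commit's stated aim of mitigating failures without much extra complexity.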

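
The two leasing modes described in 3d80d8505e (#6940), together with the "unlease on update" change from 6a450a2272 (#7030), can be sketched in memory as follows. Everything here is an assumption made for illustration: the `shardTable` type, its methods, and the `hour` helper are hypothetical, the real logic lives in SQL against the crlShards table, and ties between unknown shards are broken by lowest index rather than randomly.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// shard is a hypothetical in-memory stand-in for one crlShards row.
type shard struct {
	thisUpdate  time.Time // zero value means "never updated"
	leasedUntil time.Time
}

// shardTable maps a 1-indexed shard number to its state.
type shardTable map[int]*shard

var errLeased = errors.New("no unleased shard available")

// hour returns a fixed demo instant h hours after an arbitrary epoch.
func hour(h int) time.Time {
	return time.Date(2023, 6, 26, 0, 0, 0, 0, time.UTC).Add(time.Duration(h) * time.Hour)
}

// lease marks shard idx as taken until the given time, and fails if an
// unexpired lease already exists.
func (t shardTable) lease(idx int, now, until time.Time) error {
	s, ok := t[idx]
	if !ok {
		s = &shard{}
		t[idx] = s
	}
	if s.leasedUntil.After(now) {
		return errLeased
	}
	s.leasedUntil = until
	return nil
}

// leaseOldest leases the unleased shard in [min, max] with the oldest
// thisUpdate. Shards the table does not know about count as oldest.
func (t shardTable) leaseOldest(min, max int, now, until time.Time) (int, error) {
	found := false
	best := 0
	var bestUpdate time.Time
	for idx := min; idx <= max; idx++ {
		var tu time.Time
		if s, ok := t[idx]; ok {
			if s.leasedUntil.After(now) {
				continue // someone else is working on this shard
			}
			tu = s.thisUpdate
		}
		if !found || tu.Before(bestUpdate) {
			found, best, bestUpdate = true, idx, tu
		}
	}
	if !found {
		return 0, errLeased
	}
	return best, t.lease(best, now, until)
}

// update records a new thisUpdate and clears the lease.
func (t shardTable) update(idx int, thisUpdate time.Time) {
	t[idx] = &shard{thisUpdate: thisUpdate}
}

func main() {
	table := shardTable{1: {thisUpdate: hour(8)}, 2: {thisUpdate: hour(5)}}
	now, until := hour(10), hour(11)
	first, _ := table.leaseOldest(1, 3, now, until)
	second, _ := table.leaseOldest(1, 3, now, until)
	fmt.Println(first, second, table.lease(2, now, until))
}
```

Because a lease is just a timestamp, it expires on its own if the worker that took it never calls `update`.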

719 Commits
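
As an illustration of c75bf7033a (#6863), which drops the hostname and port from stored http-01 validation records and repopulates them from the URL on read, the following sketch derives both values from a URL string. It is an assumption-laden example: `hostPortFromURL` is a hypothetical helper, not the `RehydrateHostPort` method itself, and the default ports are the standard ones for each scheme.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostPortFromURL recovers the hostname and port implied by a validation URL.
// When the URL carries no explicit port, the scheme's default is used.
func hostPortFromURL(raw string) (host, port string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	host = u.Hostname()
	port = u.Port()
	if port == "" {
		switch u.Scheme {
		case "http":
			port = "80"
		case "https":
			port = "443"
		default:
			return "", "", fmt.Errorf("unsupported scheme %q", u.Scheme)
		}
	}
	return host, port, nil
}

func main() {
	host, port, err := hostPortFromURL("http://example.com/.well-known/acme-challenge/token")
	fmt.Println(host, port, err)
}
```

Since both fields are fully determined by the URL, storing them separately is redundant, which is the saving the commit message describes.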