boulder

Commit Graph

Author	SHA1	Message	Date
Samantha Frank	428fcb30de	ARI: Store and reflect optional "replaces" value for Orders (#8056 ) - Plumb the "replaces" value from the WFE through to the SA via the RA - Store validated "replaces" value for new orders in the orders table - Reflect the stored "replaces" value to subscribers in the order object - Reorder CertificateProfileName before Replaces/ReplacesSerial in RA and SA protos for consistency Fixes #8034	2025-03-12 15:09:29 -04:00
Aaron Gable	d9433fe293	Remove 'RETURNING' functionality from MultiInserter (#7740 ) Deprecate the "InsertAuthzsIndividually" feature flag, which has been set to true in both Staging and Production. Delete the code guarded behind that flag being false, namely the ability of the MultiInserter to return the newly-created IDs from all of the rows it has inserted. This behavior is being removed because it is not supported in MySQL / Vitess. Fixes https://github.com/letsencrypt/boulder/issues/7718 --- > [!WARNING] > ~~Do not merge until IN-10737 is complete~~	2025-02-19 14:37:22 -08:00
Aaron Gable	6b1e7f04e8	SA: Clean up pre-profile order schema and feature flag (#7953 ) Deprecate the MultipleCertificateProfiles feature flag, which has been enabled in both Staging and Prod. Delete all code protected by that flag being false, namely the orderModelv1 type and its support code. Update the config schema to match the config-next schema. Fixes https://github.com/letsencrypt/boulder/issues/7324 Fixes https://github.com/letsencrypt/boulder/issues/7408	2025-01-17 17:15:01 -08:00
Matthew McPherrin	ace233cbdc	Update admin-revoker certs to be admin (#7947 ) The admin and admin-revoker tools shared certs. admin-revoker is gone, so update the certs to use the admin name only.	2025-01-17 16:02:20 -05:00
James Renken	e4668b4ca7	Deprecate DisableLegacyLimitWrites & UseKvLimitsForNewOrder flags; remove code using certificatesPerName & newOrdersRL tables (#7858 ) Remove code using `certificatesPerName` & `newOrdersRL` tables. Deprecate `DisableLegacyLimitWrites` & `UseKvLimitsForNewOrder` flags. Remove legacy `ratelimit` package. Delete these RA test cases: - `TestAuthzFailedRateLimitingNewOrder` (rl: `FailedAuthorizationsPerDomainPerAccount`) - `TestCheckCertificatesPerNameLimit` (rl: `CertificatesPerDomain`) - `TestCheckExactCertificateLimit` (rl: `CertificatesPerFQDNSet`) - `TestExactPublicSuffixCertLimit` (rl: `CertificatesPerDomain`) Rate limits in NewOrder are now enforced by the WFE, starting here: `5a9b4c4b18/wfe2/wfe.go (L781)` We collect a batch of transactions to check limits, check them all at once, go through and find which one(s) failed, and serve the failure with the Retry-After that's furthest in the future. All this code doesn't really need to be tested again; what needs to be tested is that we're returning the correct failure. That code is `NewOrderLimitTransactions`, and the `ratelimits` package's tests cover this. The public suffix handling behavior is tested by `TestFQDNsToETLDsPlusOne`: `5a9b4c4b18/ratelimits/utilities_test.go (L9)` Some other RA rate limit tests were deleted earlier, in #7869. Part of #7671.	2025-01-10 12:50:57 -08:00
Samantha Frank	6c85b8d019	wfe/sa/features: Deprecate TrackReplacementCertificatesARI (#7766 )	2024-10-24 13:38:33 -04:00
Aaron Gable	7b032a663f	Add feature flag to remove use of "INSERT RETURNING" in NewOrderAndAuthzs (#7739 ) This is our only use of MariaDB's "INSERT ... RETURNING" syntax, which does not exist in MySQL and Vitess. Add a feature flag which removes our use of this feature, so that we can easily disable it and then re-enable it if it turns out to be too much of a performance hit. Also add a benchmark showing that the serial-insertion approach is slower, but perhaps not debilitatingly so. Part of https://github.com/letsencrypt/boulder/issues/7718	2024-10-04 14:56:44 -07:00
Samantha Frank	2fa9fbcd23	SA: Add feature flag DisableLegacyLimitWrites (#7728 )	2024-09-30 14:09:40 -04:00
Phil Porada	30c6e592f7	sfe: Implement self-service frontend for account pausing/unpausing (#7500 ) Adds a new boulder component named `sfe` aka the Self-service FrontEnd which is dedicated to non-ACME related Subscriber functions. This change implements one such function which is a web interface and handlers for account unpausing. When paused, an ACME client receives a log line URL with a JWT parameter from the WFE. For the observant Subscriber, manually clicking the link opens their web browser and displays a page with a pre-filled HTML form. Upon clicking the form button, the SFE sends an HTTP POST back to itself and either validates the JWT and issues an RA gRPC request to unpause the account, or returns an HTML error page. The SFE and WFE should share a 32 byte seed value e.g. the output of `openssl rand -hex 16` which will be used as a go-jose symmetric signer using the HS256 algorithm. The SFE will check various [RFC 7519](https://datatracker.ietf.org/doc/html/rfc7519) claims on the JWT such as the `iss`, `aud`, `nbf`, `exp`, `iat`, and a custom `apiVersion` claim. The SFE should not yet be relied upon or deployed to staging/production environments. It is very much a work in progress, but this change is big enough as-is. Related to https://github.com/letsencrypt/boulder/issues/7406 Part of https://github.com/letsencrypt/boulder/issues/7499	2024-07-10 10:52:33 -04:00
Aaron Gable	6ae6aa8e90	Dynamically generate grpc-creds at integration test startup (#7477 ) The summary here is: - Move test/cert-ceremonies to test/certs - Move .hierarchy (generated by the above) to test/certs/webpki - Remove our mapping of .hierarchy to /hierarchy inside docker - Move test/grpc-creds to test/certs/ipki - Unify the generation of both test/certs/webpki and test/certs/ipki into a single script at test/certs/generate.sh - Make that script the entrypoint of a new docker compose service - Have t.sh and tn.sh invoke that service to ensure keys and certs are created before tests run No production changes are necessary, the config changes here are just for testing purposes. Part of https://github.com/letsencrypt/boulder/issues/7476	2024-05-15 11:31:23 -04:00
Phil Porada	8556eaedca	SA: store and return certificate profile name (#7352 ) Adds `certificateProfileName` to the `orders` database table. The [maximum length](https://github.com/letsencrypt/boulder/pull/7325/files#diff-a64a0af7cbf484da8e6d08d3eefdeef9314c5d9888233f0adcecd21b800102acR35) of a profile name matches the `//issuance` package. Adds a `MultipleCertificateProfiles` feature flag that, when enabled, will store the certificate profile name from a `NewOrderRequest`. The certificate profile name is allowed to be empty and the database will treat that row as [NULL](https://mariadb.com/kb/en/null-values/). When the SA retrieves this potentially NULL row, it will be cast as the golang string zero value `""`. SRE ticket IN-10145 has been filed to perform the database migration and enable the new feature flag. The migration must be performed before enabling the feature flag. Part of https://github.com/letsencrypt/boulder/issues/7324	2024-03-20 13:08:31 -04:00
Samantha	f10abd27eb	SA/ARI: Add method of tracking certificate replacement (#7284 ) Part of #6732 Part of #7038	2024-02-08 14:19:29 -05:00
Aaron Gable	10e894a172	Create new admin tool (#7276 ) Create a new administration tool "bin/admin" as a successor to and replacement of "admin-revoker". This new tool supports all the same fundamental capabilities as the old admin-revoker, including: - Revoking by serial, by batch of serials, by incident table, and by private key - Blocking a key to let bad-key-revoker take care of revocation - Clearing email addresses from all accounts that use them Improvements over the old admin-revoker include: - All commands run in "dry-run" mode by default, to prevent accidental executions - All revocation mechanisms allow setting the revocation reason, skipping blocking the key, indicating that the certificate is malformed, and controlling the number of parallel workers conducting revocation - All revocation mechanisms do not parse the cert in question, leaving that to the RA - Autogenerated usage information for all subcommands - A much more modular structure to simplify adding more capabilities in the future - Significantly simplified tests with smaller mocks The new tool has analogues of all of admin-revoker's unit tests, and all integration tests have been updated to use the new tool instead. A future PR will remove admin-revoker, once we're sure SRE has had time to update all of their playbooks. Fixes https://github.com/letsencrypt/boulder/issues/7135 Fixes https://github.com/letsencrypt/boulder/issues/7269 Fixes https://github.com/letsencrypt/boulder/issues/7268 Fixes https://github.com/letsencrypt/boulder/issues/6927 Part of https://github.com/letsencrypt/boulder/issues/6840	2024-02-07 09:35:18 -08:00
Aaron Gable	97cba52e09	Remove deprecated and unused feature flags (#7207 ) These feature flags are no longer referenced in any test, staging, or production configuration. They were removed in: - StoreRevokerInfo: IN-8546 - ROCSPStage6 and ROCSPStage7: IN-8886 - CAAValidationMethods and CAAAccountURI: IN-9301	2023-12-13 13:53:31 -08:00
Matthew McPherrin	cb5384dcd7	Add --addr and/or --debug-addr flags to all commands (#7175 ) Many services already have --addr and/or --debug-addr flags. However, it wasn't universal, so this PR adds flags to commands where they're not currently present. This makes it easier to use a shared config file but listen on different ports, for running multiple instances on a single host. The config options are made optional as well, and removed from config-next/.	2023-12-07 17:41:01 -08:00
Jacob Hoffman-Andrews	725f190c01	ca: remove orphan queue code (#7025 ) The `orphanQueueDir` config field is no longer used anywhere. Fixes #6551	2023-08-02 16:04:28 -07:00
Aaron Gable	908421bb98	crl-updater: lease CRL shards to prevent races (#6941 ) Add a new feature flag, LeaseCRLShards, which controls certain aspects of crl-updater's behavior. When this flag is enabled, crl-updater calls the new SA.LeaseCRLShard method before beginning work on a shard. This prevents it from stepping on the toes of another crl-updater instance which may be working on the same shard. This is important to prevent two competing instances from accidentally updating a CRL's Number (which is an integer representation of its thisUpdate timestamp) backwards, which would be a compliance violation. When this flag is enabled, crl-updater also calls the new SA.UpdateCRLShard method after finishing work on a shard. In the future, additional work will be done to make crl-updater use the "give me the oldest available shard" mode of the LeaseCRLShard method. Fixes https://github.com/letsencrypt/boulder/issues/6897	2023-07-19 15:11:16 -07:00
Samantha	124c4cc6f5	grpc/sa: Implement deep health checks (#6928 ) Add the necessary scaffolding for deep health checking of our various gRPC components. Each component implementation that also implements the grpc.checker interface will be checked periodically, and the health status of the component will be updated accordingly. Add the necessary methods to SA to implement the grpc.checker interface and register these new health checks with Consul. Additionally: - Update entry point script to check for ProxySQL readiness. - Increase the poll rate for gRPC Consul checks from 5s to 2s to help with DNS failures, due to check failures, on startup. - Change log level for Consul from INFO to ERROR to deal with noisy logs full of transport failures due to Consul gRPC checks firing before the SAs are up. Fixes #6878 Part of #6795	2023-06-12 13:58:53 -04:00
Samantha	f09a94bd74	consul: Configure gRPC health check for SA (#6908 ) Enable SA gRPC health checks in Consul ahead of further changes for #6878. Calls to the `Check` method of the SA's grpc.health.v1.Health service must respond `SERVING` before the `sa` service will be advertised in Consul DNS. Consul will continue to poll this service every 5 seconds. - Add `bconsul` docker service to boulder `bluenet` and `rednet` - Add TLS credentials for `consul.boulder`: ```shell $ openssl x509 -in consul.boulder/cert.pem -text | grep DNS DNS:consul.boulder ``` - Update `test/grpc-creds/generate.sh` to add `consul.boulder` - Update test SA configs to allow `consul.boulder` to access `grpc.health.v1.Health` Part of #6878	2023-05-23 13:16:49 -04:00
Matthew McPherrin	b7d9f8c2e3	In config-next/, opentelemetry -> openTelemetry for consistency (#6888 ) In configs, opentelemetry -> openTelemetry As pointed out in review of #6867, these should match the case of their corresponding Go identifiers for consistency. JSON keys are case-insensitive in Go (part of why we've got a fork in go-jose), so this change should have no functional impact.	2023-05-15 17:07:29 -04:00
Matthew McPherrin	8427245675	OTel Integration test using jaeger (#6842 ) This adds Jaeger's all-in-one dev container (with no persistent storage) to boulder's dev docker-compose. It configures config-next/ to send all traces there. A new integration test creates an account and issues a cert, then verifies the trace contains some set of expected spans. This test found that async finalize broke spans, so I fixed that and a few related spots where we make a new context.	2023-05-05 10:41:29 -04:00
Aaron Gable	45329c9472	Deprecate ROCSPStage7 flag (#6804 ) Deprecate the ROCSPStage7 feature flag, which caused the RA and CA to stop generating OCSP responses when issuing new certs and when revoking certs. (That functionality is now handled just-in-time by the ocsp-responder.) Delete the old OCSP-generating codepaths from the RA and CA. Remove the CA's internal reference to an OCSP implementation, because it no longer needs it. Additionally, remove the SA's "Issuers" config field, which was never used. Fixes #6285	2023-04-12 17:03:06 -07:00
Aaron Gable	7e994a1216	Deprecate ROCSPStage6 feature flag (#6770 ) Deprecate the ROCSPStage6 feature flag. Remove all references to the `ocspResponse` column from the SA, both when reading from and when writing to the `certificateStatus` table. This makes it safe to fully remove that column from the database. IN-8731 enabled this flag in all environments, so it is safe to deprecate. Part of #6285	2023-04-04 15:41:51 -07:00
Matthew McPherrin	05c9106eba	lints: Consistently format JSON configuration files (#6755 ) - Consistently format existing test JSON config files - Add a small Python script which loads and dumps JSON files - Add CI JSON lint test to CI --------- Co-authored-by: Aaron Gable <aaron@aarongable.com>	2023-03-20 18:11:19 -04:00
Matthew McPherrin	e1ed1a2ac2	Remove beeline tracing (#6733 ) Remove tracing using Beeline from Boulder. The only remnant left behind is the deprecated configuration, to ensure deployability. We had previously planned to swap in OpenTelemetry in a single PR, but that adds significant churn in a single change, so we're doing this as multiple steps that will each be significantly easier to reason about and review. Part of #6361	2023-03-14 15:14:27 -07:00
Samantha	a0fe7dc93e	SA: Remove Redis config (#6695 ) This field doesn't appear to be in use. Part of #6052	2023-02-27 09:29:38 -08:00
Aaron Gable	f9e4fb6c06	Add replication lag retries to some SA methods (#6649 ) Add a new time.Duration field, LagFactor, to both the SA's config struct and the read-only SA's implementation struct. In the GetRegistration, GetOrder, and GetAuthorization2 methods, if the database select returned a NoRows error and a lagFactor duration is configured, then sleep for the lagFactor duration and retry the select. This allows us to compensate for the replication lag between our primary write database and our read-only replica databases. Sometimes clients will fire requests in rapid succession (such as creating a new order, then immediately querying the authorizations associated with that order), and the subsequent requests will fail because they are directed to read replicas which are lagging behind the primary. Adding this simple sleep-and-retry will let us mitigate many of these failures, without adding too much complexity. Fixes #6593	2023-02-14 17:25:13 -08:00
Samantha	5c49231ea6	ROCSP: Remove support for Redis Cluster (#6645 ) Fixes #6517	2023-02-09 17:14:37 -05:00
Samantha	6c6da76400	ROCSP: Replace Redis Cluster with consistently sharded all-primary nodes (#6516 )	2022-12-19 15:06:47 -05:00
Aaron Gable	ba34ac6b6e	Use read-only SA clients in wfe, ocsp, and crl (#6484 ) In the WFE, ocsp-responder, and crl-updater, switch from using StorageAuthorityClients to StorageAuthorityReadOnlyClients. This ensures that these services cannot call methods which write to our database. Fixes #6454	2022-12-02 13:48:28 -08:00
Aaron Gable	4f473edfa8	Deprecate 10 feature flags (#6502 ) Deprecate these feature flags, which are consistently set in both prod and staging and which we do not expect to change the value of ever again: - AllowReRevocation - AllowV1Registration - CheckFailedAuthorizationsFirst - FasterNewOrdersRateLimit - GetAuthzReadOnly - GetAuthzUseIndex - MozRevocationReasons - RejectDuplicateCSRExtensions - RestrictRSAKeySizes - SHA1CSRs Move each feature flag to the "deprecated" section of features.go. Remove all references to these feature flags from Boulder application code, and make the code they were guarding the only path. Deduplicate tests which were testing both the feature-enabled and feature-disabled code paths. Remove the flags from all config-next JSON configs (but leave them in config ones until they're fully deleted, not just deprecated). Finally, replace a few testdata CSRs used in CA tests, because they had SHA1WithRSAEncryption signatures that are now rejected. Fixes #5171 Fixes #6476 Part of #5997	2022-11-14 09:24:50 -08:00
Aaron Gable	9e67423110	Create new StorageAuthorityReadOnly gRPC service (#6483 ) Create a new gRPC service named StorageAuthorityReadOnly which only exposes a read-only subset of the existing StorageAuthority service's methods. Implement this by splitting the existing SA in half, and having the read-write half embed and wrap an instance of the read-only half. Unfortunately, many of our tests use exported read-write methods as part of their test setup, so the tests are all being performed against the read-write struct, but they are exercising the same code as the read-only implementation exposes. Expose this new service at the SA on the same port as the existing service, but with (in config-next) different sets of allowed clients. In the future, read-only clients will be removed from the read-write service's set of allowed clients. Part of #6454	2022-11-09 11:09:12 -08:00
Aaron Gable	46c8d66c31	bgrpc.NewServer: support multiple services (#6487 ) Turn bgrpc.NewServer into a builder-pattern, with a config-based initialization, multiple calls to Add to add new gRPC services, and a final call to Build to produce the start() and stop() functions which control server behavior. All calls are chainable to produce compact code in each component's main() function. This improves the process of creating a new gRPC server in three ways: 1) It avoids the need for generics/templating, which was slightly verbose. 2) It allows the set of services to be registered on this server to be known ahead of time. 3) It greatly streamlines adding multiple services to the same server, which we use today in the VA and will be using soon in the SA and CA. While we're here, add a new per-service config stanza to the GRPCServerConfig, so that individual services on the same server can have their own configuration. For now, only provide a "ClientNames" key, which will be used in a follow-up PR. Part of #6454	2022-11-04 13:26:42 -07:00
Samantha	9c12e58c7b	grpc: Allow static host override in client config (#6423 ) - Add a new gRPC client config field which overrides the dNSName checked in the certificate presented by the gRPC server. - Revert all test gRPC credentials to `<service>.boulder` - Revert all ClientNames in gRPC server configs to `<service>.boulder` - Set all gRPC clients in `test/config` to use `serverAddress` + `hostOverride` - Set all gRPC clients in `test/config-next` to use `srvLookup` + `hostOverride` - Rename incorrect SRV record for `ca` with port `9096` to `ca-ocsp` - Rename incorrect SRV record for `ca` with port `9106` to `ca-crl` Resolves #6424	2022-10-03 15:23:55 -07:00
Samantha	90eb90bdbe	test: Replace sd-test-srv with consul (#6389 ) - Add a dedicated Consul container - Replace `sd-test-srv` with Consul - Add documentation for configuring Consul - Re-issue all gRPC credentials for `<service-name>.service.consul` Part of #6111	2022-09-19 16:13:53 -07:00
Jacob Hoffman-Andrews	db044a8822	log: fix spurious honeycomb warnings; improve stdout logger (#6364 ) Honeycomb was emitting logs directly to stderr like this: ``` WARN: Missing API Key. WARN: Dataset is ignored in favor of service name. Data will be sent to service name: boulder ``` Fix this by providing a fake API key and replacing "dataset" with "serviceName" in configs. Also add missing Honeycomb configs for crl-updater. For stdout-only logger, include checksums and escape newlines.	2022-09-14 11:25:02 -07:00
Samantha	78ea1d2c9d	SA: Use separate schema for incidents tables (#6350 ) - Move incidents tables from `boulder_sa` to `incidents_sa` (added in #6344) - Grant read perms for all tables in `incidents_sa` - Modify unit tests to account for new schema and grants - Add database cleaning func for `boulder_sa` - Adjust cleanup funcs to omit `sql-migrate` tables instead of `goose` Resolves #6328	2022-09-09 15:17:14 -07:00
Jacob Hoffman-Andrews	dd1c52573e	log: allow logging to stdout/stderr instead of syslog (#6307 ) Right now, Boulder expects to be able to connect to syslog, and panics if it's not available. We'd like to be able to log to stdout/stderr as a replacement for syslog. - Add a detailed timestamp (down to microseconds, same as we collect in prod via syslog). - Remove the escape codes for colorizing output. - Report the severity level numerically rather than with a letter prefix. Add locking for stdout/stderr and syslog logs. Neither the [syslog] package nor the [os] package document concurrency-safety, and the Go rule is: if it's not documented to be concurrent-safe, it's not. Notably the [log.Logger] package is documented to be concurrent-safe, and a look at its implementation shows it uses a Mutex internally. Remove places that use the singleton `blog.Get()`, and instead pass through a logger from main in all the places that need it. [syslog]: https://pkg.go.dev/log/syslog [os]: https://pkg.go.dev/os [log.Logger]: https://pkg.go.dev/log#Logger	2022-08-29 06:19:22 -07:00
Aaron Gable	09195e6804	ocsp-responder: get minimal status info from SA (#6293 ) Add a new `GetRevocationStatus` gRPC method to the SA which retrieves only the subset of the certificate status metadata relevant to revocation, namely whether the certificate has been revoked, when it was revoked, and the revocation reason. Notably, this method is our first use of the `google.protobuf.Timestamp` type in a message, which is more ergonomic and less prone to errors than using unix nanoseconds. Use this new method in ocsp-responder's checked_redis_source, to avoid having to send many other pieces of metadata and the full ocsp response bytes over the network. It provides all the information necessary to determine if the response from Redis is up-to-date. Within the checked_redis_source, use this new method in two different ways: if only a database connection is configured (as is the case today) then get this information directly from the db; if a gRPC connection to the SA is available then prefer that instead. This may make requests slower, but will allow us to remove database access from the hosts which run the ocsp-responder today, simplifying our network. The new behavior consists of two pieces, each locked behind a config gate: - Performing the smaller database query is only enabled if the ocsp-responder has the `ROCSPStage3` feature flag enabled. - Talking to the SA rather than the database directly is only enabled if the ocsp-responder has an `saService` gRPC stanza in its config. Fixes #6274	2022-08-16 16:37:24 -07:00
Aaron Gable	3a12177eab	ROCSP Stage 6: Never write OCSP responses to DB (#6284 ) Create a new `ROCSPStage6` feature flag which affects the behavior of the SA. When enabled, this flag causes the `AddPrecertificate`, `RevokeCertificate`, and `UpdateRevokedCertificate` methods to ignore the OCSP response bytes provided by their caller. They will no longer error out if those bytes are missing, and if the bytes are present they will still not be written to the database. This allows us to, in the future, cause the RA and CA to stop generating those OCSP responses entirely, and stop providing them to the SA, without causing any errors when we do. Part of #6079	2022-08-10 15:31:26 -07:00
Aaron Gable	436061fb35	CRL: Create crl-updater service (#6212 ) Create a new service named crl-updater. It is responsible for maintaining the full set of CRLs we issue: one "full and complete" CRL for each currently-active Issuer, split into a number of "shards" which are essentially CRLs with arbitrary scopes. The crl-updater is modeled after the ocsp-updater: it is a long-running standalone service that wakes up periodically, does a large amount of work in parallel, and then sleeps. The period at which it wakes to do work is configurable. Unlike the ocsp-responder, it does all of its work every time it wakes, so we expect to set the update frequency at 6-24 hours. Maintaining CRL scopes is done statelessly. Every certificate belongs to a specific "bucket", given its notAfter date. This mapping is generally unchanging over the life of the certificate, so revoked certificate entries will not be moving between shards upon every update. The only exception is if we change the number of shards, in which case all of the bucket boundaries will be recomputed. For more details, see the comment on `getShardBoundaries`. It uses the new SA.GetRevokedCerts method to collect all of the revoked certificates whose notAfter timestamps fall within the boundaries of each shard's time-bucket. It uses the new CA.GenerateCRL method to sign the CRLs. In the future, it will send signed CRLs to the crl-storer to be persisted outside our infrastructure. Fixes #6163	2022-07-08 09:34:51 -07:00
Andrew Gabbitas	79048cffba	Support writing initial OCSP response to redis (#5958 ) Adds a rocsp redis client to the sa if cluster information is provided in the sa config. If a redis cluster is configured, all new certificate OCSP responses added with sa.AddPrecertificate will attempt to be written to the redis cluster, but will not block or fail on errors. Fixes: #5871	2022-03-21 20:33:12 -06:00
Aaron Gable	c7643992a0	Enable USE INDEX hints when querying authz2 table (#5823 ) Add a new feature flag `GetAuthzUseIndex` which causes the SA to add `USE INDEX (regID_identifer_status_expires_idx)` to its authz2 database queries. This should encourage the query planner to actually use that index instead of falling back to large table-scans. Fixes #5822	2021-12-01 14:48:09 -08:00
Aaron Gable	8eb7272adf	SA: Use read-only connector for GetAuthorizations2 (#5815 ) Add a feature flag which causes the SA to switch between using the traditional read-write database connector (pointed at the primary db) or the newer read-only database connector (usually pointed at a replica) when executing the `GetAuthorizations2` query.	2021-11-24 16:57:42 -08:00
J.C. Jones	7b31bdb30a	Add read-only dbConns to SQLStorageAuthority and OCSPUpdater (#5555 ) This changeset adds a second DB connect string for the SA for use in read-only queries that are not themselves dependencies for read-write queries. In other words, this is attempting to only catch things like rate-limit `SELECT`s and other coarse-counting, so we can potentially move those read queries off the read-write primary database. It also adds a second DB connect string to the OCSP Updater. This is a little trickier, as the subsequent `UPDATE`s _are_ dependent on the output of the `SELECT`, but in this case it's operating on data batches, and a few seconds' replication latency are several orders of magnitude below the threshold for update frequency, so any certificates that aren't caught on run `n` can be caught on run `n+1`. Since we export DB metrics to Prometheus, this also refactors `InitDBMetrics` to take a DB Address (host:port tuple) and User out of the DB connection DSN and include those as labels in the metrics. Fixes #5550 Fixes #4985	2021-08-02 11:21:34 -07:00
Aaron Gable	9abb39d4d6	Honeycomb integration proof-of-concept (#5408 ) Add Honeycomb tracing to all Boulder components which act as HTTP servers, gRPC servers, or gRPC clients. Add many values which we currently emit to logs to the trace spans. Add a way to configure the Honeycomb integration to our config files, and by default configure all of our tests to "mute" (send nothing). Followup changes will refine the configuration, attempt to reduce the new dependency load, and introduce better sampling. Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218	2021-05-24 16:13:08 -07:00
Aaron Gable	6e6be607fa	Deprecate StoreIssuerInfo flag (#5386 ) This flag is no longer referenced by any code, and can be safely deprecated. Part of #5079	2021-04-13 17:18:01 -07:00
Jacob Hoffman-Andrews	b4e483d38b	Add gRPC MaxConnectionAge config. (#5311 ) This allows servers to tell clients to go away after some period of time, which triggers the clients to re-resolve DNS. Per grpc/grpc#12295, this is the preferred way to do this. Related: #5307.	2021-03-01 18:37:47 -08:00
Samantha	e2e7dad034	Move cmd.DBConfig fields to their own named sub-struct (#5286 ) The named field `DB` in each component configuration struct acts as the receiver for the value of `db` when component JSON files are unmarshalled. When `cmd.DBConfig` fields are received at the root of a component configuration struct instead of in `DB`, copy them to the `DB` field of that struct. Move existing `cmd.DBConfig` values from the root of each component's JSON configuration in `test/config-next` to `db`. Part of #5275	2021-02-16 10:48:58 -08:00
Samantha	e0510056cc	Enhancements to SQL driver tuning via JSON config (#5235 ) Historically the only database/sql driver setting exposed via JSON config was maxDBConns. This change adds support for maxIdleConns, connMaxLifetime, connMaxIdleTime, and renames maxDBConns to maxOpenConns. The addition of these settings will give our SRE team a convenient method for tuning the reuse/closure of database connections. A new struct, DBSettings, has been added to SA. The struct and each of its fields have been commented. All new fields have been plumbed through to the relevant Boulder components and exported as Prometheus metrics. Tests have been added/modified to ensure that the fields are being set. There should be no loss in coverage. Deployability concerns for the migration from maxDBConns to maxOpenConns have been addressed with the temporary addition of the helper method cmd.DBConfig.GetMaxOpenConns(). This method can be removed once test/config is defaulted to using maxOpenConns. Relevant sections of the code have TODOs added that link back to a newly opened issue. Fixes #5199	2021-01-25 15:34:55 -08:00

105 Commits