boulder

Commit Graph

Author	SHA1	Message	Date
Aaron Gable	10e894a172	Create new admin tool (#7276 ) Create a new administration tool "bin/admin" as a successor to and replacement of "admin-revoker". This new tool supports all the same fundamental capabilities as the old admin-revoker, including: - Revoking by serial, by batch of serials, by incident table, and by private key - Blocking a key to let bad-key-revoker take care of revocation - Clearing email addresses from all accounts that use them Improvements over the old admin-revoker include: - All commands run in "dry-run" mode by default, to prevent accidental executions - All revocation mechanisms allow setting the revocation reason, skipping blocking the key, indicating that the certificate is malformed, and controlling the number of parallel workers conducting revocation - All revocation mechanisms do not parse the cert in question, leaving that to the RA - Autogenerated usage information for all subcommands - A much more modular structure to simplify adding more capabilities in the future - Significantly simplified tests with smaller mocks The new tool has analogues of all of admin-revoker's unit tests, and all integration tests have been updated to use the new tool instead. A future PR will remove admin-revoker, once we're sure SRE has had time to update all of their playbooks. Fixes https://github.com/letsencrypt/boulder/issues/7135 Fixes https://github.com/letsencrypt/boulder/issues/7269 Fixes https://github.com/letsencrypt/boulder/issues/7268 Fixes https://github.com/letsencrypt/boulder/issues/6927 Part of https://github.com/letsencrypt/boulder/issues/6840	2024-02-07 09:35:18 -08:00
Jacob Hoffman-Andrews	ce5632b480	Remove `service1` / `service2` names in consul (#7266 ) These names corresponded to single instances of a service, and were primarily used for (a) specifying which interface to bind a gRPC port on and (b) allowing `health-checker` to check individual instances rather than a service as a whole. For (a), change the `--grpc-addr` flags to bind to "all interfaces." For (b), provide a specific IP address and port for health checking. This required adding a `--hostOverride` flag for `health-checker` because the service certificates contain hostname SANs, not IP address SANs. Clarify the situation with nonce services a little bit. Previously we had one nonce "service" in Consul and got nonces from that (i.e. randomly between the two nonce-service instances). Now we have two nonce services in consul, representing multiple datacenters, and one of them is explicitly configured as the "get" service, while both are configured as the "redeem" service. Part of #7245. Note this change does not yet get rid of the rednet/bluenet distinction, nor does it get rid of all use of 10.88.88.88. That will be a followup change.	2024-01-22 09:34:20 -08:00
Matthew McPherrin	56c10c613c	Update zlint (#7252 ) Upgrade to zlint v3.6.0 Two new lints are triggered in various places: aia_contains_internal_names is ignored in integration test configurations, and unit tests are updated to have more realistic URLs. The w_subject_common_name_included lint needs to be ignored where we'd ignored n_subject_common_name_included before. Related to https://github.com/letsencrypt/boulder/issues/7261	2024-01-16 11:50:37 -08:00
Aaron Gable	d38b7b685b	Fix flaky integration test failures (#7262 ) This partially reverts commit `20b121138c`, which was landed in https://github.com/letsencrypt/boulder/pull/7254. Specifically, it reverts the addition of "noWaitForReady" to the health-checker's gRPC config. This appears to stop the flaky `last resolver error: produced zero addresses` failures we've been seeing in the CI integration tests.	2024-01-16 09:50:13 -08:00
Jacob Hoffman-Andrews	20b121138c	health-checker: bail early on handshake failure (#7254 ) When we have a problem with our authentication certificates, it's better to get a clear error early than to wait for health checker to time out. Also, set noWaitForReady in the config, which prevents detailed errors from being obscured by "timed out" errors.	2024-01-11 09:36:35 -08:00
Jacob Hoffman-Andrews	7b347dd6c3	Use different ports for instances of the same service (#7246 ) Part of #7245. This just provides a unique port for each instance, and breaks the service<->port mapping. A subsequent PR will move to listening on the same IP. Remove unused `-b` variants of crl-storer and akamai-purger. The new port scheme is that the first instance of a service is on `93xx` and the second instance of a service is on `94xx`. Part of a stacked change with #7243.	2024-01-10 14:32:33 -08:00
Jacob Hoffman-Andrews	cd3bbf91ad	test: move SRV stanzas from config-next to config (#7243 ) Service discovery via SRV records is now deployed in prod.	2024-01-10 10:31:23 -08:00
Phil Porada	2e951b0105	Remove ca-a and ca-b distinction in test configs (#7238 ) Fixes https://github.com/letsencrypt/boulder/issues/7187	2024-01-08 13:19:28 -08:00
Aaron Gable	6b54b61f21	Prevent serial prefixes from beginning with a 1 (#7214 ) Change the max value of the CA's `SerialPrefix` config value from 255 (a byte of all 1s) to 127 (a byte of one 0 followed by seven 1s). This prevents the serial prefix from ever beginning with a 1. This is important because serials are interpreted as signed (twos-complement) integers, and are required to be positive -- a serial whose first bit is 1 is considered to be negative and therefore in violation of RFC 5280. The go stdlib fixes this for us by prepending a zero byte to any serial that begins with a 1 bit, but we'd prefer all our serials to be the same length. Corresponding config change was completed in IN-9880.	2023-12-15 07:37:44 -08:00
Aaron Gable	97cba52e09	Remove deprecated and unused feature flags (#7207 ) These feature flags are no longer referenced in any test, staging, or production configuration. They were removed in: - StoreRevokerInfo: IN-8546 - ROCSPStage6 and ROCSPStage7: IN-8886 - CAAValidationMethods and CAAAccountURI: IN-9301	2023-12-13 13:53:31 -08:00
Aaron Gable	81cb970d30	Remove crlURL from test CA issuer configs (#7132 ) This value is always set to the empty string in prod, which (correctly) results in the issued certificates not having a CRLDP at all. It turns out our integration test environment has been including CRLDPs in all of our test certs because we set crlURL to a non-empty value! This change updates our test configs to match reality. I'll remove the code which supports this config value as part of my upcoming CA CRLDP changes.	2023-11-02 11:20:50 -07:00
Jacob Hoffman-Andrews	c84201c09a	observer: add TCP prober (#7118 ) This is potentially useful for diagnosing issues with connection timeouts, which could have separate causes from HTTP errors. For instance, a connection timeout is more likely to be caused by network congestion or high CPU usage on the load balancer.	2023-10-27 09:11:18 -07:00
Phil Porada	72e01b337a	ceremony: Distinguish between intermediate and cross-sign ceremonies (#7005 ) In `//cmd/ceremony`: * Added `CertificateToCrossSignPath` to the `cross-certificate` ceremony type. This new input field takes an existing certificate that will be cross-signed and performs checks against the manually configured data in each ceremony file. * Added byte-for-byte subject/issuer comparison checks to root, intermediate, and cross-certificate ceremonies to detect that signing is happening as expected. * Added Fermat factorization check from the `//goodkey` package to all functions that generate new key material. In `//linter`: * The Check function now exports linting certificate bytes. The idea is that a linting certificate's `tbsCertificate` bytes can be compared against the final certificate's `tbsCertificate` bytes as a verification that `x509.CreateCertificate` was deterministic and produced identical DER bytes after each signing operation. Other notable changes: * Re-orders the issuers list in each CA config to match staging and production. There is an ordering issue mentioned by @aarongable two years ago on IN-5913 that didn't make its way back to this repository. > Order here matters – the default chain we serve for each intermediate should be the first listed chain containing that intermediate. * Enables `ECDSAForAll` in `config-next` CA configs to match Staging. * Generates 2x new ECDSA subordinate CAs cross-signed by an RSA root and adds these chains to the WFE for clients to download. * Increased the test.sh startup timeout to account for the extra ceremony run time. Fixes https://github.com/letsencrypt/boulder/issues/7003 --------- Co-authored-by: Aaron Gable <aaron@letsencrypt.org>	2023-08-23 14:01:19 -04:00
Aaron Gable	6a450a2272	Improve CRL shard leasing (#7030 ) Simplify the index-picking logic in the SA's leaseOldestCrlShard method. Specifically, more clearly separate it into "missing" and "non-missing" cases, which require entirely different logic: picking a random missing shard, or picking the oldest unleased shard, respectively. Also change the UpdateCRLShard method to "unlease" shards when they're updated. This allows the crl-updater to run as quickly as it likes, while still ensuring that multiple instances do not step on each other's toes. The config change for shardWidth and lookbackPeriod instead of certificateLifetime has been deployed in prod since IN-8445. The config change changing the shardWidth is just so that the tests neither produce a bazillion shards, nor have to do a bazillion SA queries for each chunk within a shard, improving the readability of test logs. Part of https://github.com/letsencrypt/boulder/issues/7023	2023-08-08 17:05:00 -07:00
Aaron Gable	9a4f0ca678	Deprecate LeaseCRLShards feature (#7009 ) This feature flag is enabled in both staging and prod.	2023-08-07 15:17:00 -07:00
Jacob Hoffman-Andrews	725f190c01	ca: remove orphan queue code (#7025 ) The `orphanQueueDir` config field is no longer used anywhere. Fixes #6551	2023-08-02 16:04:28 -07:00
Aaron Gable	8d8fd3731b	Remove VA.DNSResolver (#7001 ) I have confirmed that this config field is not set in any deployment environment. Fixes https://github.com/letsencrypt/boulder/issues/6868	2023-07-13 17:56:41 -07:00
Aaron Gable	158f62bd0c	Remove policy qualifiers from all issuance paths (#6980 ) The inclusion of Policy Qualifiers inside Policy Information elements of a Certificate Policies extension is now NOT RECOMMENDED by the Baseline Requirements. We have already removed these fields from all of our Boulder configuration, and ceased issuing certificates with Policy Qualifiers. Remove all support for configuring and including Policy Qualifiers in our certificates, both in Boulder's main issuance path and in our ceremony tool. Switch from using the policyasn1 library to manually encode these extensions, to using the crypto/x509's Certificate.PolicyIdentifiers field. Delete the policyasn1 package as it is no longer necessary. Fixes https://github.com/letsencrypt/boulder/issues/6880	2023-07-13 10:37:05 -07:00
Jacob Hoffman-Andrews	2041e8723b	integration: shorten log output (#6894 ) Remove the load test stage of the integration test, which generates superfluous amounts of log. Turn down logging on the CA and VA from info to error-only. Part of https://github.com/letsencrypt/boulder/issues/6890	2023-06-05 13:11:19 -04:00
Jacob Hoffman-Andrews	80e1510819	admin: add clear-email subcommand (#6919 ) When a user wants their email address deleted from the database but no longer has access to their account, this allows an administrator to clear it. This adds `admin` as an alias for `admin-revoker`, because we'd like the clear-email sub-command to be a part of that overall tool, but it's not really revocation related. Part of #6864	2023-06-01 14:33:24 -04:00
Samantha	f09a94bd74	consul: Configure gRPC health check for SA (#6908 ) Enable SA gRPC health checks in Consul ahead of further changes for #6878. Calls to the `Check` method of the SA's grpc.health.v1.Health service must respond `SERVING` before the `sa` service will be advertised in Consul DNS. Consul will continue to poll this service every 5 seconds. - Add `bconsul` docker service to boulder `bluenet` and `rednet` - Add TLS credentials for `consul.boulder`: ```shell $ openssl x509 -in consul.boulder/cert.pem -text | grep DNS DNS:consul.boulder ``` - Update `test/grpc-creds/generate.sh` to add `consul.boulder` - Update test SA configs to allow `consul.boulder` to access `grpc.health.v1.Health` Part of #6878	2023-05-23 13:16:49 -04:00
Samantha	c453ca0571	grpc: Deprecate clientNames field (#6870 ) - SRE removed in IN-8755 Fixes #6698	2023-05-08 14:49:27 -04:00
Samantha	c9173cc024	boulder-va: Remove deprecated Common fields stanza (#6871 ) - SRE removed in IN-8752. Fixes #6716	2023-05-08 11:47:17 -04:00
Jacob Hoffman-Andrews	1c7e0fd1d8	Store linting certificate instead of precertificate (#6807 ) In order to get rid of the orphan queue, we want to make sure that before we sign a precertificate, we have enough data in the database that we can fulfill our revocation-checking obligations even if storing that precertificate in the database fails. That means: - We should have a row in the certificateStatus table for the serial. - But we should not serve "good" for that serial until we are positive the precertificate was issued (BRs 4.9.10). - We should have a record in the live DB of the proposed certificate's public key, so the bad-key-revoker can mark it revoked. - We should have a record in the live DB of the proposed certificate's names, so it can be revoked if we are required to revoke based on names. The SA.AddPrecertificate method already achieves these goals for precertificates by writing to the various metadata tables. This PR repurposes the SA.AddPrecertificate method to write "proposed precertificates" instead. We already create a linting certificate before the precertificate, and that linting certificate is identical to the precertificate that will be issued except for the private key used to sign it (and the AKID). So for instance it contains the right pubkey and SANs, and the Issuer name is the same as the Issuer name that will be used. So we'll use the linting certificate as the "proposed precertificate" and store it to the DB, along with appropriate metadata. In the new code path, rather than writing "good" for the new certificateStatus row, we write a new, fake OCSP status string "wait". This will cause us to return internalServerError to OCSP requests for that serial (but we won't get such requests because the serial has not yet been published). After we finish precertificate issuance, we update the status to "good" with SA.SetCertificateStatusReady. Part of #6665	2023-04-26 13:54:24 -07:00
Aaron Gable	45329c9472	Deprecate ROCSPStage7 flag (#6804 ) Deprecate the ROCSPStage7 feature flag, which caused the RA and CA to stop generating OCSP responses when issuing new certs and when revoking certs. (That functionality is now handled just-in-time by the ocsp-responder.) Delete the old OCSP-generating codepaths from the RA and CA. Remove the CA's internal reference to an OCSP implementation, because it no longer needs it. Additionally, remove the SA's "Issuers" config field, which was never used. Fixes #6285	2023-04-12 17:03:06 -07:00
Aaron Gable	e55a276efe	CA: Remove deprecated config stanzas (#6595 ) These config stanzas have been removed in staging and prod. They used to configure the separate OCSP and CRL gRPC services provided by the CA process, but the CA now provides those services on the same port as the main CA gRPC service. Fixes #6448	2023-04-07 09:37:34 -07:00
Aaron Gable	7e994a1216	Deprecate ROCSPStage6 feature flag (#6770 ) Deprecate the ROCSPStage6 feature flag. Remove all references to the `ocspResponse` column from the SA, both when reading from and when writing to the `certificateStatus` table. This makes it safe to fully remove that column from the database. IN-8731 enabled this flag in all environments, so it is safe to deprecate. Part of #6285	2023-04-04 15:41:51 -07:00
Aaron Gable	8c67769be4	Remove ocsp-updater from Boulder (#6769 ) Delete the ocsp-updater service, and the //ocsp/updater library that supports it. Remove test configs for the service, and remove references to the service from other test files. This service has been fully shut down for an extended period now, and is safe to remove. Fixes #6499	2023-03-31 14:39:04 -07:00
Aaron Gable	22fd579cf2	ARI: write Retry-After header before body (#6787 ) When sending an ARI response, write the Retry-After header before writing the JSON response body. This is necessary because http.ResponseWriter implicitly calls WriteHeader whenever Write is called, flushing all headers to the network and preventing any additional headers from being written. Unfortunately, the unittests use httptest.ResponseRecorder, which doesn't seem to enforce this invariant (it's happy to report headers which were written after the body). Add a header check to the integration tests, to make up for this deficiency.	2023-03-31 10:48:45 -07:00
Matthew McPherrin	49851d7afd	Remove Beeline configuration (#6765 ) In a previous PR, #6733, this configuration was marked deprecated pending removal. Here is that removal.	2023-03-23 16:58:36 -04:00
Samantha	b2224eb4bc	config: Add validation tags to all configuration structs (#6674 ) - Require `letsencrypt/validator` package. - Add a framework for registering configuration structs and any custom validators for each Boulder component at `init()` time. - Add a `validate` subcommand which allows you to pass a `-component` name and `-config` file path. - Expose validation via exported utility functions `cmd.LookupConfigValidator()`, `cmd.ValidateJSONConfig()` and `cmd.ValidateYAMLConfig()`. - Add unit test which validates all registered component configuration structs against test configuration files. Part of #6052	2023-03-21 14:08:03 -04:00
Matthew McPherrin	05c9106eba	lints: Consistently format JSON configuration files (#6755 ) - Consistently format existing test JSON config files - Add a small Python script which loads and dumps JSON files - Add CI JSON lint test to CI --------- Co-authored-by: Aaron Gable <aaron@aarongable.com>	2023-03-20 18:11:19 -04:00
Samantha	8440a47d0b	expiration-mailer: Remove Config.NagCheckInterval (#6712 ) Fixes #6097 Part of #6052 Blocks #6674	2023-03-01 15:45:18 -05:00
Aaron Gable	29bf521121	CA: Remove secondary gRPC servers (#6496 ) Remove the OCSPGenerator and CRLGenerator gRPC servers that run on separate ports from the CA's main gRPC server, which exposes both those and the CertificateAuthority service as well. These additional servers are no longer necessary, now that all three services are exposed on the single address/port. Fixes #6448	2023-03-01 11:45:28 -08:00
Phil Porada	fdb9c543b7	Remove ReuseValidAuthz code (#6686 ) Removes all code related to the `ReuseValidAuthz` feature flag. The Boulder default is to now always reuse valid authorizations. Fixes a panic in `test.AssertErrorIs` when `err` is unexpectedly `nil` that was found while reworking the `TestPerformValidationAlreadyValid` test. The go stdlib `func Is`[1] does not check for this. 1. https://go.dev/src/errors/wrap.go Part 2/2, fixes https://github.com/letsencrypt/boulder/issues/2734	2023-02-28 17:57:16 -05:00
Aaron Gable	427bced0cd	Remove OCSP and CRL methods from CA gRPC service (#6474 ) Remove the GenerateOCSP and GenerateCRL methods from the CertificateAuthority gRPC service. These methods are no longer called by any clients; all clients use their respective OCSPGenerator and CRLGenerator gRPC services instead. In addition, remove the CRLGeneratorServer field from the caImpl, as it no longer needs it to serve as a backing implementation for the GenerateCRL pass-through method. Unfortunately, we can't remove the OCSPGeneratorServer field until after ROCSPStage7 is complete, and the CA is no longer generating an OCSP response during initial certificate issuance. Part of #6448	2023-02-23 14:42:14 -08:00
Jacob Hoffman-Andrews	cd1bbc0d82	Tidy up integration test environment (#6668 ) Remove `example.com` domain name, which was used by the deleted OldTLS tests. Remove GODEBUG=x509sha1=1. Add a longer comment for the Consul DNS fallback in docker-compose.yml. Use the "dnsAuthority" field for all gRPC clients in config-next, instead of implicitly relying on the system DNS. This matches what we do in prod. Make "dnsAuthority" field of GRPCClientConfig mandatory whenever SRVLookup or SRVLookups is used. Make test/config/ocsp-responder.json use ServerAddress instead of SRVLookup, like the rest of test/config.	2023-02-16 09:33:24 -08:00
Samantha	5c49231ea6	ROCSP: Remove support for Redis Cluster (#6645 ) Fixes #6517	2023-02-09 17:14:37 -05:00
Samantha	d73125d8f6	WFE: Add custom balancer implementation which routes nonce redemption RPCs by prefix (#6618 ) Assign nonce prefixes for each nonce-service by taking the first eight characters of the base64url encoded HMAC-SHA256 hash of the RPC listening address using a provided key. The provided key must be the same across all boulder-wfe and nonce-service instances. - Add a custom `grpc-go` load balancer implementation (`nonce`) which can route nonce redemption RPC messages by matching the prefix to the derived prefix of the nonce-service instance which created it. - Modify the RPC client constructor to allow the operator to override the default load balancer implementation (`round_robin`). - Modify the `srv` RPC resolver to accept a comma separated list of targets to be resolved. - Remove unused nonce-service `-prefix` flag. Fixes #6404	2023-02-03 17:52:18 -05:00
Phil Porada	3866e4f60d	VA: Use default PortConfig during testing (#6609 ) Part of #3940	2023-01-25 16:16:08 -05:00
Phil Porada	aae4175186	Remove deprecated feature flags (#6566 ) Remove deprecated feature flags. Fixes #6559	2023-01-23 20:56:15 -05:00
Aaron Gable	ba34ac6b6e	Use read-only SA clients in wfe, ocsp, and crl (#6484 ) In the WFE, ocsp-responder, and crl-updater, switch from using StorageAuthorityClients to StorageAuthorityReadOnlyClients. This ensures that these services cannot call methods which write to our database. Fixes #6454	2022-12-02 13:48:28 -08:00
Aaron Gable	7517b0d80f	Rehydrate CAA account and method binding (#6501 ) Make minor changes to our implementation of CAA Account and Method Binding, as a result of reviewing the code in preparation for enabling it in production. Specifically: - Ensure that the validation method and account ID are included at the request level, rather than waiting until we perform the checks which use those parameters; - Clean up code which assumed the validation method and account ID might not be populated; - Use the core.AcmeChallenge type (rather than plain string) for the validation method everywhere; - Update comments to reference the latest version and correct sections of the CAA RFCs; and - Remove the CAA feature flags from the config integration tests to reflect that they are not yet enabled in prod. I have reviewed this code side-by-side with RFC 8659 (CAA) and RFC 8657 (ACME CAA Account and Method Binding) and believe it to be compliant with both.	2022-11-17 13:31:04 -08:00
Aaron Gable	4f473edfa8	Deprecate 10 feature flags (#6502 ) Deprecate these feature flags, which are consistently set in both prod and staging and which we do not expect to change the value of ever again: - AllowReRevocation - AllowV1Registration - CheckFailedAuthorizationsFirst - FasterNewOrdersRateLimit - GetAuthzReadOnly - GetAuthzUseIndex - MozRevocationReasons - RejectDuplicateCSRExtensions - RestrictRSAKeySizes - SHA1CSRs Move each feature flag to the "deprecated" section of features.go. Remove all references to these feature flags from Boulder application code, and make the code they were guarding the only path. Deduplicate tests which were testing both the feature-enabled and feature-disabled code paths. Remove the flags from all config-next JSON configs (but leave them in config ones until they're fully deleted, not just deprecated). Finally, replace a few testdata CSRs used in CA tests, because they had SHA1WithRSAEncryption signatures that are now rejected. Fixes #5171 Fixes #6476 Part of #5997	2022-11-14 09:24:50 -08:00
Aaron Gable	4466c953de	CA: Expose all gRPC services on single address (#6495 ) Now that we have the ability to easily add multiple gRPC services to the same server, and control access to each service individually, use that capability to expose the CA's CertificateAuthority, OCSPGenerator, and CRLGenerator services all on the same address/port. This will make establishing connections to the CA easier, but no less secure. Part of #6448	2022-11-08 15:28:59 -08:00
Samantha	b35fe81d7b	ctpolicy: Remove deprecated codepath and fix metrics (#6485 ) - Remove deprecated code for #5938 - Fix broken metrics flagged in #6435 - Make CT operator and log selection random Fixes #6435 Fixes #5938 Fixes #6486	2022-11-07 11:31:20 -08:00
Samantha	6d519059a3	akamai-purger: Deprecate PurgeInterval config field (#6489 ) Fixes #6003	2022-11-04 12:44:35 -07:00
Aaron Gable	868214b85e	CRLs: include IssuingDistributionPoint extension (#6412 ) Add the Issuing Distribution Point extension to all of our end-entity CRLs. The extension contains the Distribution Point, the URL from which this CRL is meant to be downloaded. Because our CRLs are sharded, this URL prevents an on-path attacker from substituting a different shard than the client expected in order to hide a revocation. The extension also contains the OnlyContainsUserCerts boolean, because our CRLs only contain end-entity certificates. The Distribution Point url is constructed from a configurable base URI, the issuer's NameID, the shard index, and the suffix ".crl". The base URI must use the "http://" scheme and must not end with a slash. openssl displays the IDP extension as: ``` X509v3 Issuing Distribution Point: critical Full Name: URI:http://c.boulder.test/66283756913588288/0.crl Only User Certificates ``` Fixes #6410	2022-10-24 11:21:55 -07:00
Aaron Gable	30d8f19895	Deprecate ROCSP Stage 1, 2, and 3 flags (#6460 ) These flags are set in both staging and prod. Deprecate them, make all code gated behind them the only path, and delete code (multi_source) which was only accessible when these flags were not set. Part of #6285	2022-10-21 14:58:34 -07:00
Samantha	9c12e58c7b	grpc: Allow static host override in client config (#6423 ) - Add a new gRPC client config field which overrides the dNSName checked in the certificate presented by the gRPC server. - Revert all test gRPC credentials to `<service>.boulder` - Revert all ClientNames in gRPC server configs to `<service>.boulder` - Set all gRPC clients in `test/config` to use `serverAddress` + `hostOverride` - Set all gRPC clients in `test/config-next` to use `srvLookup` + `hostOverride` - Rename incorrect SRV record for `ca` with port `9096` to `ca-ocsp` - Rename incorrect SRV record for `ca` with port `9106` to `ca-crl` Resolves #6424	2022-10-03 15:23:55 -07:00
Samantha	90eb90bdbe	test: Replace sd-test-srv with consul (#6389 ) - Add a dedicated Consul container - Replace `sd-test-srv` with Consul - Add documentation for configuring Consul - Re-issue all gRPC credentials for `<service-name>.service.consul` Part of #6111	2022-09-19 16:13:53 -07:00
Jacob Hoffman-Andrews	db044a8822	log: fix spurious honeycomb warnings; improve stdout logger (#6364 ) Honeycomb was emitting logs directly to stderr like this: ``` WARN: Missing API Key. WARN: Dataset is ignored in favor of service name. Data will be sent to service name: boulder ``` Fix this by providing a fake API key and replacing "dataset" with "serviceName" in configs. Also add missing Honeycomb configs for crl-updater. For stdout-only logger, include checksums and escape newlines.	2022-09-14 11:25:02 -07:00
Aaron Gable	7f189f7a3b	Improve how crl-updater formats and surfaces errors (#6369 ) Make every function in the Run -> Tick -> tickIssuer -> tickShard chain return an error. Make that return value a named return (which we usually avoid) so that we can remove the manual setting of the metric result label and have the deferred metric handling function take care of that instead. In addition, let that cleanup function wrap the returned error (if any) with the identity of the shard, issuer, or tick that is returning it, so that we don't have to include that info in every individual error message. Finally, have the functions which spin off many helpers (Tick and tickIssuer) collect all of their helpers' errors and only surface that error at the end, to ensure the process completes even in the presence of transient errors. In crl-updater's main, surface the error returned by Run or Tick, to make debugging easier.	2022-09-12 11:36:42 -07:00
Aaron Gable	78fbda1cd2	Enable CRL test in config integration tests (#6368 ) Now that both crl-updater and crl-storer are running in prod, run this integration test in both test environments as well. In addition, remove the fake storer grpc client that the updater used when no storer client was configured, as storer clients are now configured in all environments.	2022-09-09 16:03:49 -07:00
Jacob Hoffman-Andrews	6ad06789d9	rocsp-tool: add "get-pem" output (#6317 ) Emit PEM output instead of pretty-printed output. Send the pretty-printed output straight to stdout instead of via a logger, so the internal newlines don't get escaped. Fixes #6310	2022-08-25 12:52:58 -07:00
Aaron Gable	c1be8cfc52	crl-storer: load whole AWS config files (#6309 ) Allow the crl-storer to load whole AWS config files. Although this requires a deployment to maintain an additional config file for the crl-storer, and one in a format we usually don't use, it does give us lots of flexibility in setting up things like role assumption. Also remove the S3Region config flag, as it is now redundant with the contents of the config file, and rename the existing S3CredsFile config key to AWSCredsFile to better represent its true contents. Fixes #6308	2022-08-23 11:04:12 -07:00
Aaron Gable	b001af71e8	Add new services to log-validator test config (#6303 ) Fixes #6289	2022-08-17 16:46:11 -07:00
Aaron Gable	6a9bb399f7	Create new crl-storer service (#6264 ) Create a new crl-storer service, which receives CRL shards via gRPC and uploads them to an S3 bucket. It ignores AWS SDK configuration in the usual places, in favor of configuration from our standard JSON service config files. It ensures that the CRLs it receives parse and are signed by the appropriate issuer before uploading them. Integrate crl-updater with the new service. It streams bytes to the crl-storer as it receives them from the CA, without performing any checking at the same time. This new functionality is disabled if the crl-updater does not have a config stanza instructing it how to connect to the crl-storer. Finally, add a new test component, the s3-test-srv. This acts similarly to the existing mail-test-srv: it receives requests, stores information about them, and exposes that information for later querying by the integration test. The integration test uses this to ensure that a newly-revoked certificate does show up in the next generation of CRLs produced. Fixes #6162	2022-08-08 16:22:48 -07:00
Aaron Gable	694d73d67b	crl-updater: add UpdateOffset config to run on a schedule (#6260 ) Add a new config key `UpdateOffset` to crl-updater, which causes it to run on a regular schedule rather than running immediately upon startup and then every `UpdatePeriod` after that. It is safe for this new config key to be omitted and take the default zero value. Also add a new command line flag `runOnce` to crl-updater which causes it to immediately run a single time and then exit, rather than running continuously as a daemon. This will be useful for integration tests and emergency situations. Part of #6163	2022-07-29 13:30:16 -07:00
Aaron Gable	9ae16edf51	Fix race condition in revocation integration tests (#6253 ) Add a new filter to mail-test-srv, allowing test processes to query for messages sent from a specific address, not just ones sent to a specific address. This fixes a race condition in the revocation integration tests where the number of messages sent to a cert's contact address would be higher than expected because expiration mailer sent a message while the test was running. Also reduce bad-key-revoker's maximum backoff to 2 seconds to ensure that it continues to run frequently during the integration tests, despite usually not having any work to do. While we're here, also improve the comments on various revocation integration tests, remove some unnecessary cruft, and split the tests out to explicitly test functionality with the MozRevocationReasons flag both enabled and disabled. Also, change ocsp_helper's default output from os.Stdout to ioutil.Discard to prevent hundreds of lines of log spam when the integration tests fail during a test that uses that library. Fixes #6248	2022-07-29 09:23:50 -07:00
Jacob Hoffman-Andrews	3b09571e70	ocsp-responder: add LiveSigningPeriod (#6237 ) Previously we used "ExpectedFreshness" to control how frequently the Redis source would request re-signing of stale entries. But that field also controls whether multi_source is willing to serve a MariaDB response. It's better to split these into two values.	2022-07-20 15:36:38 -07:00
Aaron Gable	436061fb35	CRL: Create crl-updater service (#6212 ) Create a new service named crl-updater. It is responsible for maintaining the full set of CRLs we issue: one "full and complete" CRL for each currently-active Issuer, split into a number of "shards" which are essentially CRLs with arbitrary scopes. The crl-updater is modeled after the ocsp-updater: it is a long-running standalone service that wakes up periodically, does a large amount of work in parallel, and then sleeps. The period at which it wakes to do work is configurable. Unlike the ocsp-responder, it does all of its work every time it wakes, so we expect to set the update frequency at 6-24 hours. Maintaining CRL scopes is done statelessly. Every certificate belongs to a specific "bucket", given its notAfter date. This mapping is generally unchanging over the life of the certificate, so revoked certificate entries will not be moving between shards upon every update. The only exception is if we change the number of shards, in which case all of the bucket boundaries will be recomputed. For more details, see the comment on `getShardBoundaries`. It uses the new SA.GetRevokedCerts method to collect all of the revoked certificates whose notAfter timestamps fall within the boundaries of each shard's time-bucket. It uses the new CA.GenerateCRL method to sign the CRLs. In the future, it will send signed CRLs to the crl-storer to be persisted outside our infrastructure. Fixes #6163	2022-07-08 09:34:51 -07:00
Jacob Hoffman-Andrews	fda4124471	expiration-mailer: truncate serials and dns names (#6148 ) This avoids sending excessively large emails and excessively large log lines. Fixes #6085	2022-06-14 15:48:00 -07:00
Aaron Gable	f7ab64f05b	Remove last references to CFSSL (#6155 ) Just a docs and config cleanup.	2022-06-14 14:22:34 -07:00
Jacob Hoffman-Andrews	4467cf27db	Update config from config-next (#6051 ) This copies over settings from config-next that are now deployed in prod. Also, I updated a comment in sd-test-srv to more accurately describe how SRV records work.	2022-04-19 12:10:26 -07:00
Samantha	7c22b99d63	akamai-purger: Improve throughput and configuration safety (#6006 ) - Add new configuration key `throughput`, a mapping which contains all throughput related akamai-purger settings. - Deprecate configuration key `purgeInterval` in favor of `purgeBatchInterval` in the new `throughput` configuration mapping. - When no `throughput` or `purgeInterval` is provided, the purger uses optimized default settings which offer 1.9x the throughput of current production settings. - At startup, all throughput related settings are modeled to ensure that we don't exceed the limits imposed on us by Akamai. - Queue is now `[][]string`, instead of `[]string`. - When a given queue entry is purged we know all 3 of its URLs were purged. - At startup we know the size of a theoretical request to purge based on the number of queue entries included - Raises the queue size from ~333-thousand cached OCSP responses to 1.25-million, which is roughly 6 hours of work using the optimized default settings - Raise `purgeInterval` in test config from 1ms, which violates API limits, to 800ms Fixes #5984	2022-03-23 17:23:07 -07:00
Samantha	3e9eaf84ea	rocsp-tool: Add syslog support (#6010 ) Add a logging stanza to rocsp-tool's config, and initialize a boulder logger rather than using Go's default log facilities. Fixes #5976	2022-03-21 14:51:56 -07:00
Aaron Gable	910dde95f6	Clean up goodkey configs (#5993 ) Fixes https://github.com/letsencrypt/boulder/issues/5851	2022-03-15 15:26:19 -07:00
Andrew Gabbitas	d006588f46	Orphan finder: Fix redundant syslog config value (#5971 ) Replace redundant stdoutlevel with a sysloglevel value in test configs.	2022-02-24 14:24:03 -08:00
Aaron Gable	5c02deabfb	Remove wfe1 integration tests (#5840 ) These tests are testing functionality that is no longer in use in production deployments of Boulder. As we go about removing wfe1 functionality, these tests will break, so let's just remove them wholesale right now. I have verified that all of the tests removed in this PR are duplicated against wfe2. One of the changes in this PR is to cease starting up the wfe1 process in the integration tests at all. However, that component was serving requests for the AIA Issuer URL, which gets queried by various OCSP and revocation tests. In order to keep those tests working, this change also adds an integration-test-only handler to wfe2, and updates the CA configuration to point at the new handler. Part of #5681	2021-12-10 12:40:22 -08:00
Aaron Gable	316ebb44ea	Enable GetAuthzReadOnly flag in prod tests (#5824 ) This flag has been enabled in prod. Not deprecating it yet because it hasn't been live for very long.	2021-12-01 14:47:51 -08:00
Jacob Hoffman-Andrews	2b21586573	rocsp-tool: cursor scans in load-from-db (#5821 ) This is necessary because if a single query response gets too big, MariaDB will terminate it.	2021-12-01 13:41:17 -08:00
Jacob Hoffman-Andrews	4f1934af82	Add load-from-db support to rocsp-tool (#5778 ) This scans the database for certificateStatus rows, gets them signed by the CA, and writes them to Redis. Also, bump the default PoolSize for Redis to 100.	2021-11-08 17:35:10 -08:00
Jacob Hoffman-Andrews	7fab32a000	Add rocsp-tool to manually store OCSP responses in Redis (#5758 ) This is a sort of proof of concept of the Redis interaction, which will evolve into a tool for inspection and manual repair of missing entries, if we find ourselves needing to do that. The important bits here are rocsp/rocsp.go and cmd/rocsp-tool/main.go. Also, the newly-vendored Redis client.	2021-11-02 11:04:03 -07:00
Jacob Hoffman-Andrews	ba0ea090b2	integration: save hierarchy across runs (#5729 ) This allows repeated runs using the same hierarchy, and avoids spurious errors from ocsp-updater saying "This CA doesn't have an issuer cert with ID XXX" Fixes #5721	2021-10-20 17:06:33 -07:00
Jacob Hoffman-Andrews	dc742fc320	Fix expiration-mailer integration test locally. (#5719 ) The expiration mailer processes certificates in batches of size `certLimit` (default 100). In production, it runs in daemon mode, so it will go on to the next batch when the current one is done. However, in local integration tests we rely on it getting all its work done in a single run. This works when you're running from a clean slate, but if you've run integration tests a bunch of times, there will be a bunch of certificates from previous runs that clog up the queue, and it won't send mail for the specific certificate the integration test is looking for. Solution: Set `certLimit` very high in the config. Also, update the default times for sending mail to match what we have in prod.	2021-10-18 19:51:34 -07:00
Aaron Gable	3f3f250212	Sync RA feature flags (#5678 ) These flags are enabled in both prod and staging, so let's enable them in our integration tests.	2021-09-30 11:00:41 -07:00
Aaron Gable	e0c3e2c1df	Reject unrecognized config keys (#5649 ) Instead of using the default `json.Unmarshal`, explicitly construct and use a `json.Decoder` so that we can set the `DisallowUnknownFields` flag on the decoder. This causes any unrecognized config keys to result in errors at boulder startup time. Fixes #5643	2021-09-24 10:13:44 -07:00
Aaron Gable	8a70bff2b4	Deprecate cert-checker CLI flags (#5511 ) Throw away the result of parsing various command-line flags in cert-checker. Leave the flags themselves in place to avoid breaking any scripts which pass them, but only respect the values provided by the config file. Part of #5489	2021-08-16 10:12:27 -07:00
Aaron Gable	aad7fae228	Synchronize test configs for deployed changes (#5574 ) These config changes have been deployed in prod, and can be synchronized between our config and config-next test environments.	2021-08-16 08:43:16 -07:00
Aaron Gable	1c6842cf69	Delete expired-authz-purger2 (#5570 ) Delete the expired-authz-purger2 binary, as well as the various config files, tests, and test helpers that exist to support it. This utility is no longer necessary, as it has not been running for quite some time, and we have developed alternative means of keeping the growth of the authz table under control. Fixes #5568	2021-08-11 14:39:57 -07:00
Aaron Gable	ac3e5e70c4	Delete boulder-janitor (#5571 ) Delete the boulder-janitor binary, and the various configs and tests which exist to support it. This tool has not been actively running in quite some time. The tables which it covers are either supported by our more recent partitioning methods, or are rate-limit tables that we hope to move out of mysql entirely. The cost of maintaining the janitor is not offset by the benefits it brings us (or the lack thereof). Fixes #5569	2021-08-11 11:10:24 -07:00
Aaron Gable	20f1bf1d0d	Compute validity periods inclusive of notAfter second (#5494 ) In the CA, compute the notAfter timestamp such that the cert is actually valid for the intended duration, not for one second longer. In the Issuance library, compute the validity period by including the full length of the final second indicated by the notAfter date when determining if the certificate request matches our profile. Update tests and config files to match. Fixes #5473	2021-06-24 13:17:29 -07:00
Aaron Gable	6e1357efa3	Update boulder test validity period to match prod (#5493 ) In prod, the CA is now configured to issue certificates with notAfter timestamps 7775999 seconds after their notBefore timestamp, and to enforce that same difference when validating issuance requests. Update our test configs to match.	2021-06-16 18:08:57 -07:00
Samantha	be1c24165e	test: Fix uppercase ECDSAAllowListFilename in test JSON configs (#5487 )	2021-06-16 14:24:30 -07:00
Andrew Gabbitas	b5aab29407	Make boulder-observer HTTP User-Agent configurable (#5484 ) - Make User-Agent configurable in config file - Fix README example - Add tests	2021-06-14 11:08:18 -06:00
Samantha	d574b50c41	CA: Deprecate field ECDSAAllowedAccounts (#5477 ) - Remove field `ECDSAAllowedAccounts` from CA - Remove `ECDSAAllowedAccounts` from CA tests - Replace `ECDSAAllowedAccounts` with `ECDSAAllowListFilename` in `test/config/ca-a.json` and `test/config/ca-b.json` - Add YAML allow list file at `test/config/ecdsaAllowList.yml` Fixes #5394	2021-06-11 12:13:01 -07:00
Samantha	6955df0f56	contact-auditor: Add tool to audit registration contacts (#5425 ) Add tool to audit subscriber registrations for e-mail addresses that `notify-mailer` is currently configured to skip. - Add `cmd/contact-auditor` with README - Add test coverage for `cmd/contact-auditor` - Add config file at `test/config/contact-auditor` Part of #5372	2021-06-07 14:21:54 -07:00
Aaron Gable	9abb39d4d6	Honeycomb integration proof-of-concept (#5408 ) Add Honeycomb tracing to all Boulder components which act as HTTP servers, gRPC servers, or gRPC clients. Add many values which we currently emit to logs to the trace spans. Add a way to configure the Honeycomb integration to our config files, and by default configure all of our tests to "mute" (send nothing). Followup changes will refine the configuration, attempt to reduce the new dependency load, and introduce better sampling. Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218	2021-05-24 16:13:08 -07:00
Samantha	1f19eee55b	CA: Fix startup bug caused by ECDSA allow list reloader (#5412 ) Solve a nil pointer dereference of `ecdsaAllowList` in `boulder-ca` by calling `reloader.New()` in constructor `ca.NewECDSAAllowListFromFile` instead. - Add missing entry `ECDSAAllowListFilename` to `test/config-next/ca-a.json` and `test/config-next/ca-b.json` - Add missing file ecdsaAllowList.yml to `test/config-next` - Add missing entry `ECDSAAllowedAccounts` to `test/config/ca-a.json` and `test/config/ca-b.json` - Move creation of the reloader to `NewECDSAAllowListFromFile` Fixes #5414	2021-05-17 14:41:15 -07:00
Aaron Gable	6e6be607fa	Deprecate StoreIssuerInfo flag (#5386 ) This flag is no longer referenced by any code, and can be safely deprecated. Part of #5079	2021-04-13 17:18:01 -07:00
Aaron Gable	b246d9cc45	Remove certDER OCSP generation code path from CA (#5117 ) Only process OCSP generation requests which are identified by the certificate's serial number and the ID (not NameID, unfortunately) of its issuer. Delete the code path which handled OCSP generation for requests identified by the full DER of the certificate in question. Update existing tests to use serial+id to request OCSP, and move test cases from the old `TestGenerateOCSPWithIssuerID` into the default test method. Part of #5079	2021-04-09 16:08:05 -07:00
Samantha	35340ff67a	Move expired-authz-purger2 config to test directory (#5352 ) - Edit integration test to start expired-authz-purger2 with config/ config-next - Move config from `cmd/expired-authz-purger2/config.json` to `test/config/expired-authz-purger2.json` - Add a copy of `test/config/expired-authz-purger2.json` to `test/config-next/` Fixes #5351	2021-03-18 17:56:25 -07:00
Aaron Gable	f569b15b64	Remove common config from ocsp-responder (#5350 ) The old `config.Common.IssuerCert` format is no longer used in any production configs, and can be removed safely. Part of #5162 Part of #5242	2021-03-18 17:16:37 -07:00
Aaron Gable	91473b384b	Remove common config from publisher (#5353 ) The old `config.Common.CT.IntermediateBundleFilename` format is no longer used in any production configs, and can be removed safely. Part of #5162 Part of #5242 Fixes #5269	2021-03-18 16:59:06 -07:00
Samantha	5a92926b0c	Remove dbconfig migration deployability code (#5348 ) Default boulder code paths to exclusively use the `db` config key Fixes #5338	2021-03-18 16:41:15 -07:00
Aaron Gable	bae699fae1	Update CA test config to use NonCFSSLSigner (#5344 ) This config is now live in production. Part of #5115	2021-03-17 09:41:02 -07:00
Samantha	7cb0038498	Deprecate MaxDBConns for MaxOpenConns (#5274 ) In #5235 we replaced MaxDBConns in favor of MaxOpenConns. One week ago MaxDBConns was removed from all dev, staging, and production configurations. This change completes the removal of MaxDBConns from all components and test/config. Fixes #5249	2021-02-08 12:00:01 -08:00
Aaron Gable	379826d4b5	WFE2: Improve support for multiple issuers & chains (#5247 ) This change simplifies and hardens the wfe2's support for having multiple issuers, and multiple chains for each issuer, configured and loaded in memory. The only config-visible change is replacing the old two separate config values (`certificateChains` and `alternateCertificateChains`) with a single value (`chains`). This new value does not require the user to know and hand-code the AIA URLs at which the certificates are available; instead the chains are simply presented as lists of files. If this new config value is present, the old config values will be ignored; if it is not, the old config values will be respected. Behind the scenes, the chain loading code has been completely changed. Instead of loading PEM bytes directly from the file, and then asserting various things (line endings, no trailing bits, etc) about those bytes, we now parse a certificate from the file, and in-memory recreate the PEM from that certificate. This approach allows the file loading to be much more forgiving, while also being stricter: we now check that each certificate in the chain is correctly signed by the next cert, and that the last cert in the chain is a self-signed root. Within the WFE itself, most of the internal structure has been retained. However, both the internal `issuerCertificates` (used for checking that certs we are asked to revoke were in fact issued by us) and the `certificateChains` (used to append chains to end-entity certs when served to clients) have been updated to be maps keyed by IssuerNameID. This allows revocation checking to not have to iterate through the whole list of issuers, and also makes it easy to double-check that the signatures on end-entity certs are valid before serving them. Actual checking of the validity will come in a follow-up change, due to the invasive nature of the necessary test changes. Fixes #5164	2021-01-27 15:07:58 -08:00
Aaron Gable	beee17c510	Janitor: refactor to be controlled by config (#5195 ) Previously, configuration of the boulder-janitor was split into two places: the actual json config file (which controlled which jobs would be enabled, and what their rate limits should be), and the janitor code itself (which controlled which tables and columns those jobs should query). This resulted in significant duplicated code, as most of the jobs were identical except for their table and column names. This change abstracts away the query which jobs use to find work. Instead of having each job type parse its own config and produce its own work query (in Go code), now each job supplies just a few key values (the table name and two column names) in its JSON config, and the Go code assembles the appropriate query from there. We are able to delete all of the files defining individual job types, and replace them with a single slightly smarter job constructor. This enables further refactorings, namely: * Moving all of the logic code into its own module; * Ensuring that the exported interface of that module is safe (i.e. that a client cannot create and run jobs without them being valid, because the only exposed methods ensure validity); * Collapsing validity checks into a single location; * Various renamings.	2020-12-17 09:53:22 -08:00
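
Commit e0c3e2c1df above ("Reject unrecognized config keys") describes replacing the default `json.Unmarshal` with an explicitly constructed `json.Decoder` that has `DisallowUnknownFields` set. A minimal sketch of that technique follows. The `exampleConfig` struct, its fields, and the `decodeStrict` helper are hypothetical names chosen for illustration; they are not taken from Boulder's code, and Boulder's actual config loading may differ.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// exampleConfig is a hypothetical config struct used only for this sketch.
type exampleConfig struct {
	ListenAddr string `json:"listenAddr"`
	MaxConns   int    `json:"maxConns"`
}

// decodeStrict (a hypothetical helper) parses data into out and returns an
// error if data contains a key that has no matching field in out.
func decodeStrict(data []byte, out any) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	return dec.Decode(out)
}

func main() {
	var c exampleConfig

	// "maxConsn" is a deliberate misspelling of "maxConns". Plain
	// json.Unmarshal would silently ignore it; the strict decoder rejects it.
	err := decodeStrict([]byte(`{"listenAddr": ":8080", "maxConsn": 10}`), &c)
	fmt.Println("strict decode error:", err)
}
```

With this approach a misspelled key surfaces as an error when the config is read, rather than leaving the intended field silently at its zero value.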


276 Commits