boulder

Commit Graph

Author	SHA1	Message	Date
James Renken	61d2558b29	bad-key-revoker: Fix log message formatting (#8252 ) Fixes #8251	2025-06-16 11:30:14 -07:00
Aaron Gable	44f75d6abd	Remove mail functionality from bad-key-revoker (#8229 ) Simplify the main logic loop to simply revoke certs as soon as they're identified, rather than jumping through hoops to identify and deduplicate the associated accounts and emails. Make the Mailer portion of the config optional for deployability. Part of https://github.com/letsencrypt/boulder/issues/8199	2025-06-09 14:36:19 -07:00
Aaron Gable	ded2e5e610	Remove logging of contact email addresses (#7833 ) Fixes https://github.com/letsencrypt/boulder/issues/7801	2024-11-25 13:33:56 -08:00
Jacob Hoffman-Andrews	3baea4356f	Revert "sa: truncate all timestamps to seconds (#7519 )" (#7559 ) This reverts commit `2b5b6239a4`. Following up on #7556, after we made a more systematic change to use borp's TypeConverter, we no longer need to manually truncate timestamps.	2024-06-26 17:25:05 -07:00
Jacob Hoffman-Andrews	2b5b6239a4	sa: truncate all timestamps to seconds (#7519 ) As described in #7075, go-sql-driver/mysql v1.5.0 truncates timestamps to microseconds, while v1.6.0 and above does not. That means upon upgrading to v1.6.0, timestamps are written to the database with a resolution of nanoseconds, and SELECT statements also use a resolution of nanoseconds. We believe this is the cause of performance problems we observed when upgrading to v1.6.0 and above. To fix that, apply rounding in the application code. Rather than just rounding to microseconds, round to seconds since that is the resolution we care about. Using seconds rather than microseconds may also allow some of our indexes to grow more slowly over time. Note: this omits truncating some timestamps in CRL shard calculations, since truncating those resulted in test failures that I'll follow up on separately.	2024-06-12 15:00:24 -07:00
Matthew McPherrin	cb5384dcd7	Add --addr and/or --debug-addr flags to all commands (#7175 ) Many services already have --addr and/or --debug-addr flags. However, it wasn't universal, so this PR adds flags to commands where they're not currently present. This makes it easier to use a shared config file but listen on different ports, for running multiple instances on a single host. The config options are made optional as well, and removed from config-next/.	2023-12-07 17:41:01 -08:00
Phil Porada	4bd90ea82f	Log version string for more tools at startup (#7087 ) This is a followup to https://github.com/letsencrypt/boulder/pull/7086	2023-09-19 12:46:55 -04:00
Jacob Hoffman-Andrews	7d66d67054	It's borpin' time! (#6982 ) This change replaces [gorp] with [borp]. The changes consist of a mass renaming of the import and comments / doc fixups, plus modifications of many call sites to provide a context.Context everywhere, since borp newly requires this (this was one of the motivating factors for the borp fork). This also refactors `github.com/letsencrypt/boulder/db.WrappedMap` and `github.com/letsencrypt/boulder/db.Transaction` to not embed their underlying gorp/borp objects, but to have them as plain fields. This ensures that we can only call methods on them that are specifically implemented in `github.com/letsencrypt/boulder/db`, so we don't miss wrapping any. This required introducing a `NewWrappedMap` method along with accessors `SQLDb()` and `BorpDB()` to get at the internal fields during metrics and logging setup. Fixes #6944	2023-07-17 14:38:29 -07:00
Phil Porada	17fb1b287f	cmd: Export prometheus metrics for TLS cert notBefore and notAfter fields (#6836 ) Export new prometheus metrics for the `notBefore` and `notAfter` fields to track internal certificate validity periods when calling the `Load()` method for a `*tls.Config`. Each metric is labeled with the `serial` field. ``` tlsconfig_notafter_seconds{serial="2152072875247971686"} 1.664821961e+09 tlsconfig_notbefore_seconds{serial="2152072875247971686"} 1.664821960e+09 ``` Fixes https://github.com/letsencrypt/boulder/issues/6829	2023-04-24 16:28:05 -04:00
Matthew McPherrin	0060e695b5	Introduce OpenTelemetry Tracing (#6750 ) Add a new shared config stanza which all boulder components can use to configure their Open Telemetry tracing. This allows components to specify where their traces should be sent, what their sampling ratio should be, and whether or not they should respect their parent's sampling decisions (so that web front-ends can ignore sampling info coming from outside our infrastructure). It's likely we'll need to evolve this configuration over time, but this is a good starting point. Add basic Open Telemetry setup to our existing cmd.StatsAndLogging helper, so that it gets initialized at the same time as our other observability helpers. This sets certain default fields on all traces/spans generated by the service. Currently these include the service name, the service version, and information about the telemetry SDK itself. In the future we'll likely augment this with information about the host and process. Finally, add instrumentation for the HTTP servers and grpc clients/servers. This gives us a starting point of being able to monitor Boulder, but is fairly minimal as this PR is already somewhat unwieldy: It's really only enough to understand that everything is wired up properly in the configuration. In subsequent work we'll enhance those spans with more data, and add more spans for things not automatically traced here. Fixes https://github.com/letsencrypt/boulder/issues/6361 --------- Co-authored-by: Aaron Gable <aaron@aarongable.com>	2023-04-21 10:46:59 -07:00
Matthew McPherrin	49851d7afd	Remove Beeline configuration (#6765 ) In a previous PR, #6733, this configuration was marked deprecated pending removal. Here is that removal.	2023-03-23 16:58:36 -04:00
Samantha	b2224eb4bc	config: Add validation tags to all configuration structs (#6674 ) - Require `letsencrypt/validator` package. - Add a framework for registering configuration structs and any custom validators for each Boulder component at `init()` time. - Add a `validate` subcommand which allows you to pass a `-component` name and `-config` file path. - Expose validation via exported utility functions `cmd.LookupConfigValidator()`, `cmd.ValidateJSONConfig()` and `cmd.ValidateYAMLConfig()`. - Add unit test which validates all registered component configuration structs against test configuration files. Part of #6052	2023-03-21 14:08:03 -04:00
Matthew McPherrin	e1ed1a2ac2	Remove beeline tracing (#6733 ) Remove tracing using Beeline from Boulder. The only remnant left behind is the deprecated configuration, to ensure deployability. We had previously planned to swap in OpenTelemetry in a single PR, but that adds significant churn in a single change, so we're doing this as multiple steps that will each be significantly easier to reason about and review. Part of #6361	2023-03-14 15:14:27 -07:00
Jacob Hoffman-Andrews	e052e7445b	admin-revoker: document malformed-revoke (#6714 ) In particular, document that it does not purge the Akamai cache. Also, in the RA, avoid creating a "fake" certificate object containing only the serial. Instead, use req.Serial directly in most places. This uncovered some incorrect logic. Fix that logic by gating the operations that actually need a full *x509.Certificate: revoking by key, and purging the Akamai cache. Also, make `req.Serial` mandatory for AdministrativelyRevokeCertificate. This is a reopen of #6693, which accidentally got merged into a different feature branch.	2023-03-02 12:02:21 -08:00
Matthew McPherrin	391a59921b	Move cmd.ConfigDuration to config.Duration (#6705 ) We rely on the ratelimit/ package in CI to validate our ratelimit configurations. However, because that package relies on cmd/ just for cmd.ConfigDuration, many additional dependencies get pulled in. This refactors just that struct to a separate config package. This was done using Goland's automatic refactoring tooling, which also organized a few imports while it was touching them, keeping standard library, internal and external dependencies grouped.	2023-02-28 08:11:49 -08:00
Jacob Hoffman-Andrews	00e988b557	bad-key-revoker: don't error out on multiple rows (#6531 ) Fixes #6520. Note: The unittest for this would be fairly simple - add a duplicate entry in the certificateStatus table. But we can't do that because our dev environment currently has a UNIQUE KEY on that table. So adding a unittest update for this is blocked on #6519.	2022-12-05 11:30:07 -08:00
Aaron Gable	0a02cdf7e3	Streamline gRPC client creation (#6472 ) Remove the need for clients to explicitly call bgrpc.NewClientMetrics, by moving that call inside bgrpc.ClientSetup. In case ClientSetup is called multiple times, use the recommended method to gracefully recover from registering duplicate metrics. This makes gRPC client setup much more similar to gRPC server setup after the previous server refactoring change landed.	2022-10-28 08:45:52 -07:00
Aaron Gable	9c197e1f43	Use io and os instead of deprecated ioutil (#6286 ) The ioutil package has been deprecated since go1.16; the various functions it provided now exist in the os and io packages. Replace all instances of ioutil with either io or os, as appropriate.	2022-08-10 13:30:17 -07:00
Samantha	b825594fa4	bad-key-revoker: Report unprocessed keys on each invocation (#6197 ) Add a new query to bad-key-revoker, which counts the number of unprocessed/queued keys on each run. This gives us a metric by which we can see if the bad-key-revoker is backed up or running behind. Fixes #6063	2022-07-01 09:38:06 -07:00
Jacob Hoffman-Andrews	76f987a1df	Reland "Allow expiration mailer to work in parallel" (#6133 ) This reverts commit `7ef6913e71`. We turned on the `ExpirationMailerDontLookTwice` feature flag in prod, and it's working fine but not clearing the backlog. Since https://github.com/letsencrypt/boulder/pull/6100 fixed the issue that caused us to (nearly) stop sending mail when we deployed #6057, this should be safe to roll forward. The revert of the revert applied cleanly, except for expiration-mailer/main.go and `main_test.go`, particularly around the contents of `processCerts` (where `sendToOneRegID` was extracted from) and `sendToOneRegID` itself. So those areas are good targets for extra attention.	2022-05-23 16:16:43 -07:00
Aaron Gable	7ef6913e71	Revert "Allow expiration mailer to work in parallel" (#6080 ) When deployed, the newly-parallel expiration-mailer encountered unexpected difficulties and dropped to apparently sending nearly zero emails despite not throwing any real errors. Reverting the parallelism change until we understand and can fix the root cause. This reverts two commits: - Allow expiration mailer to work in parallel (#6057) - Fix data race in expiration-mailer test mocks (#6072) It also modifies the revert to leave the new `ParallelSends` config key in place (albeit completely ignored), so that the binary containing this revert can be safely deployed regardless of config status. Part of #5682	2022-05-03 13:18:40 -07:00
Jacob Hoffman-Andrews	9629c88d66	Allow expiration mailer to work in parallel (#6057 ) Previously, each account's email would be sent in serial, along with several reads from the database (to check for certificate renewal) and several writes to the database (to update `certificateStatus.lastExpirationNagSent`). This adds a config field for the expiration mailer that sets the parallelism it will use. That means making and using multiple SMTP connections as well. Previously, `bmail.Mailer` was not safe for concurrent use. It also had a piece of API awkwardness: after you created a Mailer, you had to call Connect on it to change its state. Instead of treating that as a state change on Mailer, I split out a separate component: `bmail.Conn`. Now, when you call `Mailer.Connect()`, you get a Conn. You can send mail on that Conn and Close it when you're done. A single Mailer instance can produce multiple Conns, so Mailer is now concurrency-safe (while Conn is not). This involved a moderate amount of renaming and code movement, and GitHub's move detector is not keeping up 100%, so an eye towards "is this moved code?" may help. Also adding `?w=1` to the diff URL to ignore whitespace diffs.	2022-04-21 18:04:55 -07:00
Samantha	f69b57e0e1	Make DB client initialization uniform and stop setting 'READ-UNCOMMITTED' (#5741 ) Boulder components initialize their gorp and gorp-less (non-wrapped) database clients via two new SA helpers. These helpers handle client construction, database metric initialization, and (for gorp only) debug logging setup. Removes transaction isolation parameter `'READ-UNCOMMITTED'` from all database connections. Fixes #5715 Fixes #5889	2022-01-31 13:34:23 -08:00
Jacob Hoffman-Andrews	3bf06bb4d8	Export the config structs from our main files (#5875 ) This allows our documentation on those structs to show up in our godoc output.	2022-01-12 12:20:27 -08:00
Jacob Hoffman-Andrews	23dd1e21f9	Build all boulder binaries into a single binary (#5693 ) The resulting `boulder` binary can be invoked by different names to trigger the behavior of the relevant subcommand. For instance, symlinking and invoking as `boulder-ca` acts as the CA. Symlinking and invoking as `boulder-va` acts as the VA. This reduces the .deb file size from about 200MB to about 20MB. This works by creating a registry that maps subcommand names to `main` functions. Each subcommand registers itself in an `init()` function. The monolithic `boulder` binary then checks what name it was invoked with (`os.Args[0]`), looks it up in the registry, and invokes the appropriate `main`. To avoid conflicts, all of the old `package main` are replaced with `package notmain`. To get the list of registered subcommands, run `boulder --list`. This is used when symlinking all the variants into place, to ensure the set of symlinked names matches the entries in the registry. Fixes #5692	2021-10-20 17:05:45 -07:00
Samantha	99502b1ffb	ocsp-updater: use rows.Scan() to get query results (#5656 ) - Replace `gorp.DbMap` with calls that use `sql.DB` directly - Use `rows.Scan()` and `rows.Next()` to get query results (which opens the door to streaming the results) - Export function `CertStatusMetadataFields` from `SA` - Add new function `ScanCertStatusRow` to `SA` - Add new function `NewDbSettingsFromDBConfig` to `SA` Fixes #5642 Part of #5715	2021-10-18 10:33:09 -07:00
Andrew Gabbitas	17f300387b	BadKeyRevoker: backoff on errors or no work (#5580 ) - Add exponential backoff - Add key `backoffIntervalMax` to JSON config with a default of `60s` Fixes #5559	2021-08-19 13:31:47 -07:00
James Renken	5b37639109	Add certNotAfter condition to initial bad-key-revoker query (#5556 ) Check the `certNotAfter` column earlier in `bad-key-revoker`'s work, to avoid unnecessary queries to `certificateStatus` and `precertificates` about certificates we know are expired. Update `bad-key-revoker` tests to set unexpired certificates to have a future expiration time, and to use a fake clock for better hermeticity. Part of #5548	2021-08-02 16:01:42 -07:00
J.C. Jones	7b31bdb30a	Add read-only dbConns to SQLStorageAuthority and OCSPUpdater (#5555 ) This changeset adds a second DB connect string for the SA for use in read-only queries that are not themselves dependencies for read-write queries. In other words, this is attempting to only catch things like rate-limit `SELECT`s and other coarse-counting, so we can potentially move those read queries off the read-write primary database. It also adds a second DB connect string to the OCSP Updater. This is a little trickier, as the subsequent `UPDATE`s _are_ dependent on the output of the `SELECT`, but in this case it's operating on data batches, and a few seconds' replication latency are several orders of magnitude below the threshold for update frequency, so any certificates that aren't caught on run `n` can be caught on run `n+1`. Since we export DB metrics to Prometheus, this also refactors `InitDBMetrics` to take a DB Address (host:port tuple) and User out of the DB connection DSN and include those as labels in the metrics. Fixes #5550 Fixes #4985	2021-08-02 11:21:34 -07:00
Aaron Gable	8be32d3312	Use google.protobuf.Empty instead of core.Empty (#5454 ) Replace `core.Empty` with `google.protobuf.Empty` in all of our gRPC methods which consume or return an empty protobuf. The golang core proto libraries provide an empty message type, so there is no need for us to reinvent the wheel. This change is backwards-compatible and does not require a special deploy. The protobuf message descriptions of `core.Empty` and `google.protobuf.Empty` are identical, so their wire-formats are indistinguishable and therefore interoperable / cross-compatible. Fixes #5443	2021-06-03 14:17:41 -07:00
Aaron Gable	9abb39d4d6	Honeycomb integration proof-of-concept (#5408 ) Add Honeycomb tracing to all Boulder components which act as HTTP servers, gRPC servers, or gRPC clients. Add many values which we currently emit to logs to the trace spans. Add a way to configure the Honeycomb integration to our config files, and by default configure all of our tests to "mute" (send nothing). Followup changes will refine the configuration, attempt to reduce the new dependency load, and introduce better sampling. Part of https://github.com/letsencrypt/dev-misc-tickets/issues/218	2021-05-24 16:13:08 -07:00
Samantha	5a92926b0c	Remove dbconfig migration deployability code (#5348 ) Default boulder code paths to exclusively use the `db` config key Fixes #5338	2021-03-18 16:41:15 -07:00
Samantha	e2e7dad034	Move cmd.DBConfig fields to their own named sub-struct (#5286 ) Named field `DB`, in each component configuration struct, acts as the receiver for the value of `db` when component JSON files are unmarshalled. When `cmd.DBConfig` fields are received at the root of a component configuration struct instead of `DB`, copy them to the `DB` field of the component configuration struct. Move existing `cmd.DBConfig` values from the root of each component's JSON configuration in `test/config-next` to `db`. Part of #5275	2021-02-16 10:48:58 -08:00
Samantha	7cb0038498	Deprecate MaxDBConns for MaxOpenConns (#5274 ) In #5235 we replaced MaxDBConns in favor of MaxOpenConns. One week ago MaxDBConns was removed from all dev, staging, and production configurations. This change completes the removal of MaxDBConns from all components and test/config. Fixes #5249	2021-02-08 12:00:01 -08:00
Samantha	e0510056cc	Enhancements to SQL driver tuning via JSON config (#5235 ) Historically the only database/sql driver setting exposed via JSON config was maxDBConns. This change adds support for maxIdleConns, connMaxLifetime, connMaxIdleTime, and renames maxDBConns to maxOpenConns. The addition of these settings will give our SRE team a convenient method for tuning the reuse/closure of database connections. A new struct, DBSettings, has been added to SA. The struct, and each of its fields, has been commented. All new fields have been plumbed through to the relevant Boulder components and exported as Prometheus metrics. Tests have been added/modified to ensure that the fields are being set. There should be no loss in coverage. Deployability concerns for the migration from maxDBConns to maxOpenConns have been addressed with the temporary addition of the helper method cmd.DBConfig.GetMaxOpenConns(). This method can be removed once test/config is defaulted to using maxOpenConns. Relevant sections of the code have TODOs added that link back to a newly opened issue. Fixes #5199	2021-01-25 15:34:55 -08:00
Aaron Gable	4d72f1f60e	RA: Update RPC interface to proto3 (#5039 ) Updates the Registration Authority to use proto3 for its RPC methods. This turns out to be a fairly minimal change, as many of the RA's request and response messages are defined in core.proto, and are therefore still proto2. Fixes #4955	2020-08-24 13:00:41 -07:00
Jacob Hoffman-Andrews	be2b19efee	Improve bad-key-revoker log output. (#4924 ) Previously this was logging a map of emails to unrevokedCertificates. Since unrevokedCertificates includes DER, which is a []byte, this was getting printed as a series of decimal numbers, one for each byte. This adds a Stringer implementation for unrevokedCertificates that omits the DER. Fixes #4921.	2020-07-02 10:09:26 -07:00
Jacob Hoffman-Andrews	36c8fed4d9	Fix up NoRows handling in bad-key-revoker. (#4874 ) Join on the precertificates table to handle the case when a precertificate was issued but no certificate. Treat NoRows as a regular error. Use named constants to specify revoked/expired arguments to insertCert helper. Remove the config gate on the bad-key-revoker unittest.	2020-06-23 11:31:21 -07:00
Jacob Hoffman-Andrews	d1fa9f9db8	Add more logging to bad-key-revoker. (#4871 )	2020-06-15 16:24:44 -07:00
Roland Bracewell Shoemaker	356510aa54	cmd/bad-key-revoker: don't skip certificates where the account has no contacts (#4872 )	2020-06-15 10:33:28 -07:00
Roland Bracewell Shoemaker	c1fc30020e	Fix bug in how bad-key-revoker resolves contacts (#4833 ) admin-revoker uses a dummy registration ID (0) when adding rows to the blockedKeys table. resolveContacts in bad-key-revoker fails if it cannot look up a registration. Don't bother adding the id to the list of ids to resolve, and add a catch for non-existent registration IDs to resolveContacts.	2020-06-01 14:32:02 -07:00
Roland Bracewell Shoemaker	63aa8acbeb	Fix bad-key-revoker emailing corner case (#4810 ) Fixes a corner case where we would still send emails to the original revoker's contact address if they didn't have any extant certificates associated with the account that did the revoking.	2020-05-18 11:53:17 -07:00
Roland Bracewell Shoemaker	087e91934d	Fix bad-key-revoker select (#4806 ) Adds a missing LIMIT, and adds a test case that catches the previous problem.	2020-05-07 13:05:20 -07:00
Roland Bracewell Shoemaker	97390560a3	Handful of revocation pkg cleanups (#4801 ) When we originally added this package (4 years ago) x/crypto/ocsp didn't have its own list of revocation reasons, so we added our own. Now it does have its own list, so just use that list instead of duplicating code for no real reason. Also we build a list of the revocation reasons we support so that we can tell users when they try to use an unsupported one. Instead of building this string every time, just build it once during package initialization. Finally, return the same error message in wfe that we use in wfe2 when a user requests an unsupported reason.	2020-04-30 17:29:42 -07:00
Roland Bracewell Shoemaker	70ff4d9347	Add bad-key-revoker daemon (#4788 ) Adds a daemon which monitors the new blockedKeys table and checks for any unexpired, unrevoked certificates that are associated with the added SPKI hashes and revokes them, notifying the user that issued the certificates. Fixes #4772.	2020-04-23 11:51:59 -07:00

45 Commits
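
Illustrative sketch for commit 23dd1e21f9 (not taken from the boulder source): that commit message describes a registry that maps subcommand names to `main` functions, with each subcommand registering itself in `init()` and the combined binary dispatching on `os.Args[0]`. The Go snippet below is a minimal, self-contained rendering of that idea. The names `registry`, `register`, `lookup`, and `list`, and the strings the stand-in subcommands return, are hypothetical and chosen only for this example; the real implementation may differ in every detail.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// registry maps a subcommand name to its entry point.
// Hypothetical: entry points return a string here so the sketch is easy to test.
var registry = map[string]func() string{}

// register adds a subcommand and panics on a duplicate name, so that
// conflicting registrations are caught at startup.
func register(name string, run func() string) {
	if _, dup := registry[name]; dup {
		panic("duplicate subcommand: " + name)
	}
	registry[name] = run
}

// In a multi-package layout each subcommand would call register from its
// own init(); two stand-ins are registered here for illustration.
func init() {
	register("boulder-ca", func() string { return "acting as the CA" })
	register("boulder-va", func() string { return "acting as the VA" })
}

// lookup resolves the name the binary was invoked with (for example a
// symlink path) to a registered entry point.
func lookup(argv0 string) (func() string, bool) {
	run, ok := registry[filepath.Base(argv0)]
	return run, ok
}

// list returns the registered subcommand names in sorted order.
func list() []string {
	names := make([]string, 0, len(registry))
	for name := range registry {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	// Mirror the `--list` flag mentioned in the commit message.
	if len(os.Args) > 1 && os.Args[1] == "--list" {
		for _, name := range list() {
			fmt.Println(name)
		}
		return
	}
	run, ok := lookup(os.Args[0])
	if !ok {
		fmt.Fprintln(os.Stderr, "unknown subcommand; registered:", list())
		return
	}
	fmt.Println(run())
}
```

Built as a single binary and symlinked as `boulder-ca`, this sketch would take the CA branch; invoked under any unregistered name it only reports the registered names.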