800 Commits

Author SHA1 Message Date
Aaron Gable a61aadc9e4
Move revokedCertificates and replacementOrders into db (#7639)
These database schema changes were applied in prod in IN-9706 and
IN-10081.
2024-08-01 10:01:59 -07:00
Samantha Frank c13591ab82
SFE: Call RA.UnpauseAccount and handle result (#7638)
Call `RA.UnpauseAccount` for valid unpause form submissions.

Determine and display the appropriate outcome to the Subscriber based on
the count returned by `RA.UnpauseAccount`:
- If the count is zero, display the "Account already unpaused" message.
- If the count equals the max number of identifiers allowed in a single
request, display a page explaining the need to visit the unpause URL
again.
- Otherwise, display the "Successfully unpaused all N identifiers"
message.
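
The three outcomes above can be sketched as a single switch on the returned count. This is only an illustration: `unpauseMessage` and `maxPerRequest` are hypothetical names, and the wording of the middle message is assumed.

```go
package main

import "fmt"

// unpauseMessage picks the Subscriber-facing outcome for a given count.
// maxPerRequest is a hypothetical stand-in for the maximum number of
// identifiers that a single unpause request can handle.
func unpauseMessage(count, maxPerRequest int64) string {
	switch {
	case count == 0:
		return "Account already unpaused"
	case count == maxPerRequest:
		// Assumed wording: more identifiers may remain paused.
		return "Some identifiers were unpaused; visit the unpause URL again"
	default:
		return fmt.Sprintf("Successfully unpaused all %d identifiers", count)
	}
}

func main() {
	fmt.Println(unpauseMessage(0, 50000))
	fmt.Println(unpauseMessage(50000, 50000))
	fmt.Println(unpauseMessage(12, 50000))
}
```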

Apply per-request timeout from the SFE configuration.

Part of https://github.com/letsencrypt/boulder/issues/7406
2024-07-31 14:46:46 -04:00
Aaron Gable 98a4bc01ea
Rename 'now' to 'validUntil' in GetAuthz requests (#7631)
The name "now" was always misleading: we never set the value to the
actual current time, but always to some time in the
future to avoid returning authzs which expire in the very near future.
Changing the name to "validUntil" matches the current naming in
GetPendingAuthorizationRequest.
2024-07-25 10:52:34 -07:00
Samantha Frank 15ad9fc5ab
sa: Provide a grace period for recently unpaused identifiers (#7573)
SA method PauseIdentifiers skips identifiers unpaused within the last 2
weeks, providing a grace period for operators to fix configuration
issues resulting in numerous contiguous validation failures.

Part of #7475
2024-07-11 12:11:27 -04:00
Samantha Frank 63452d5afe
sa: Avoid database timeouts in UnpauseAccount (#7572)
SA method UnpauseAccount uses up to 5 `UPDATE` query iterations, each
with a `LIMIT` of 10000, to unpause up to 50000 identifiers and returns
a count of identifiers unpaused.
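
A rough sketch of the bounded batching loop described above. The `update` callback is a hypothetical stand-in for one `UPDATE ... LIMIT` query, and stopping early on a short batch is an assumption rather than something the commit message states.

```go
package main

import "fmt"

const (
	maxBatches = 5     // iteration cap from the commit message
	batchLimit = 10000 // per-query LIMIT from the commit message
)

// unpauseInBatches calls update at most maxBatches times and returns the
// total number of rows changed. update is hypothetical: it represents one
// `UPDATE ... LIMIT batchLimit` query and returns the rows it affected.
func unpauseInBatches(update func(limit int) (int, error)) (int, error) {
	total := 0
	for i := 0; i < maxBatches; i++ {
		n, err := update(batchLimit)
		if err != nil {
			return total, err
		}
		total += n
		if n < batchLimit {
			// Assumption: a short batch means nothing is left to unpause.
			break
		}
	}
	return total, nil
}

func main() {
	remaining := 23456 // pretend this many identifiers are paused
	total, err := unpauseInBatches(func(limit int) (int, error) {
		n := limit
		if remaining < n {
			n = remaining
		}
		remaining -= n
		return n, nil
	})
	fmt.Println(total, err)
}
```

Keeping each query small bounds how long any single statement runs, while the iteration cap bounds the total work per request.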

Part of #7475
2024-07-10 10:41:51 -04:00
Jacob Hoffman-Andrews 926a0704b4
sa: Truncate notBefore times on issuedNames (#7568)
We only care about the date of an issuedName, not the exact time, and
this may reduce the size of the index somewhat.

---------

Co-authored-by: Samantha Frank <hello@entropy.cat>
2024-07-03 11:11:59 -04:00
Jacob Hoffman-Andrews 3baea4356f
Revert "sa: truncate all timestamps to seconds (#7519)" (#7559)
This reverts commit 2b5b6239a4.

Following up on #7556, after we made a more systematic change to use
borp's TypeConverter, we no longer need to manually truncate timestamps.
2024-06-26 17:25:05 -07:00
Jacob Hoffman-Andrews c0ffbac7a8
sa: truncate times in type converter (#7556)
We believe the MariaDB query planner generates inefficient query plans
when a time index is queried using high precision (nanosecond) times.
This uses the updated borp from [1] to automatically truncate
`time.Time` and `*time.Time` in query parameters.

[1]: https://github.com/letsencrypt/borp/pull/11

Part of #5437
2024-06-24 11:26:34 -07:00
Samantha Frank de9c06129b
SA: Last round of comments for PR #7490 (#7551)
Fixes #7550
Part of #7406
Part of #7475
2024-06-20 12:56:39 -04:00
Samantha 594cb1332f
SA: Implement schema and methods for (account, hostname) pausing (#7490)
Add the storage implementation for our new (account, hostname) pair
pausing feature.

- Add schema and model for the new paused table
- Add SA service methods for interacting with the paused table

Part of #7406
Part of #7475
2024-06-17 10:18:10 -04:00
Jacob Hoffman-Andrews 2b5b6239a4
sa: truncate all timestamps to seconds (#7519)
As described in #7075, go-sql-driver/mysql v1.5.0 truncates timestamps
to microseconds, while v1.6.0 and above does not. That means upon
upgrading to v1.6.0, timestamps are written to the database with a
resolution of nanoseconds, and SELECT statements also use a resolution
of nanoseconds. We believe this is the cause of performance problems
we observed when upgrading to v1.6.0 and above.

To fix that, apply rounding in the application code. Rather than
just rounding to microseconds, round to seconds since that is the
resolution we care about.  Using seconds rather than microseconds
may also allow some of our indexes to grow more slowly over time.

Note: this omits truncating some timestamps in CRL shard calculations,
since truncating those resulted in test failures that I'll follow up
on separately.
2024-06-12 15:00:24 -07:00
Aaron Gable b92581d620
Better compile-time type checking for gRPC server implementations (#7504)
Replaced our embeds of foopb.UnimplementedFooServer with
foopb.UnsafeFooServer. Per the grpc-go docs this reduces the "forwards
compatibility" of our implementations, but that is only a concern for
codebases that are implementing gRPC interfaces maintained by third
parties, and which want to be able to update those third-party
dependencies without updating their own implementations in lockstep.
Because we update our protos and our implementations simultaneously, we
can remove this safety net to replace runtime type checking with
compile-time type checking.

However, that replacement is not enough, because we never pass our
implementation objects to a function which asserts that they match a
specific interface. So this PR also replaces our reflect-based unittests
with idiomatic interface assertions. I do not view this as a perfect
solution, as it relies on people implementing new gRPC servers to add
this line, but it is no worse than the status quo which relied on people
adding the "TestImplementation" test.
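
For readers unfamiliar with the idiom, a minimal sketch of a compile-time interface assertion follows. `FooServer` and `fooImpl` are hypothetical stand-ins for a generated gRPC server interface and its implementation.

```go
package main

import "fmt"

// FooServer is a hypothetical stand-in for a generated gRPC server interface.
type FooServer interface {
	Bar(req string) (string, error)
}

// fooImpl is a hypothetical implementation of FooServer.
type fooImpl struct{}

func (f *fooImpl) Bar(req string) (string, error) {
	return "bar: " + req, nil
}

// Compile-time check: the build fails if *fooImpl stops satisfying FooServer.
// No value is allocated and nothing runs at runtime.
var _ FooServer = (*fooImpl)(nil)

func main() {
	var s FooServer = &fooImpl{}
	out, _ := s.Bar("hello")
	fmt.Println(out)
}
```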

Fixes https://github.com/letsencrypt/boulder/issues/7497
2024-05-28 09:26:29 -07:00
Aaron Gable 89213f9214
Use generic types for gRPC stream implementations (#7501)
Update the version of protoc-gen-go-grpc that we use to generate Go gRPC
code from our proto files, and update the versions of other gRPC tools
and libraries that we use to match. Turn on the new
`use_generic_streams` code generation flag to change how
protoc-gen-go-grpc generates implementations of our streaming methods,
from creating a wholly independent implementation for every stream to
using shared generic implementations.

Take advantage of this code-sharing to remove our SA "wrapper" methods,
now that they have truly the same signature as the SARO methods which
they wrap. Also remove all references to the old-style stream names
(e.g. foopb.FooService_BarMethodClient) and replace them with the new
underlying generic names, for the sake of consistency. Finally, also
remove a few custom stream test mocks, replacing them with the generic
mocks.ServerStreamClient.

Note that this PR does not change the names in //mocks/sa.go, to avoid
conflicts with work happening in the pursuit of
https://github.com/letsencrypt/boulder/issues/7476. Note also that this
PR updates the version of protoc-gen-go-grpc that we use to a specific
commit. This is because, although a new release of grpc-go itself has
been cut, the codegen binary is a separate Go module with its own
releases, and it hasn't had a new release cut yet. Tracking for that is
in https://github.com/grpc/grpc-go/issues/7030.
2024-05-24 13:54:25 -07:00
Jacob Hoffman-Andrews e75a821cc9
sa: eliminate requestedNames table (#7471)
Part of https://github.com/letsencrypt/boulder/issues/7432

Follows up on https://github.com/letsencrypt/boulder/pull/7435, now that
that PR is deployed.
2024-05-06 09:15:51 -07:00
Aaron Gable 6063430aed
admin: fail if any error is encountered during parallel processing (#7466)
While we don't want to halt the admin tool in the midst of its parallel
processing, we can keep track of whether it has encountered any errors
and raise one meta-error at the end of its execution. This will prevent
the top-level admin code from claiming that execution succeeded, and
ensure operators notice any previously-logged errors.

As part of this, fix the SA's GetLintPrecertificate wrapper to actually
call the SARO's GetLintPrecertificate, instead of incorrectly calling
the SARO's GetCertificate.

Fixes https://github.com/letsencrypt/boulder/issues/7460
2024-05-01 13:57:32 -07:00
Aaron Gable db77952e87
RA: fix GetSerialsByKey and GetSerialsByAccount (#7465)
Correctly explode the params slice with Go's "..." notation so that
gorp/go-sql-driver correctly receives each element of the params slice,
rather than receiving the slice as a whole. Also use the SA's clock,
rather than the DB's, to control which certs are selected -- in
deployments this wouldn't make a difference but in test those clocks can
be very different.
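
The slice-versus-exploded distinction can be demonstrated with any variadic function. `countArgs` below is a hypothetical stand-in for a query method that accepts `args ...interface{}`.

```go
package main

import "fmt"

// countArgs is a hypothetical stand-in for a variadic query method; it
// reports how many separate arguments it received.
func countArgs(args ...interface{}) int {
	return len(args)
}

func main() {
	params := []interface{}{"spkiHash", 42}
	fmt.Println(countArgs(params))    // the slice arrives as one argument
	fmt.Println(countArgs(params...)) // each element arrives separately
}
```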

Add two unit tests to ensure this query does not regress, and create a
generic fake gRPC server stream for use in several SA tests including
the new ones.

Fixes https://github.com/letsencrypt/boulder/issues/7460
2024-05-01 13:57:25 -07:00
Aaron Gable e05d47a10a
Replace explicit int loops with range-over-int (#7434)
This adopts modern Go syntax to reduce the chance of off-by-one errors
and remove unnecessary loop variable declarations.

Fixes https://github.com/letsencrypt/boulder/issues/7227
2024-04-22 10:34:51 -07:00
Jacob Hoffman-Andrews 89b07f4543
sa: get order names from authorizations (#7435)
This removes the only place we query the requestedNames table, which
allows us to get rid of it in a subsequent PR (once this one is merged
and deployed).

Part of https://github.com/letsencrypt/boulder/issues/7432
2024-04-18 14:00:53 -07:00
Aaron Gable 13172ac3f1
Fix cert-checker in config-next integration tests (#7440)
Cert checker has been failing in config-next because the
"CertCheckerRequiresCorrespondence" flag is enabled, but no permissions
were granted on the precertificates table. While this did cause it to
log profusely, this did not cause the tests to fail because
integration-test.py simply executes cert-checker without caring about
its exit code.
2024-04-18 09:40:41 -07:00
Aaron Gable 5c97f994bb
SA: Remove unused PreviousCertificateExists method (#7439)
This method has had no callers since the removal of ACMEv1.
2024-04-18 12:29:35 -04:00
Aaron Gable a7b73450d5
Re-enable lints on go1.22 (#7412)
We had disabled our lints on go1.22 because golangci-lint and
staticcheck didn't work with some of its updates. Re-enable them, and
fix the things which the updated linters catch now.

Fixes https://github.com/letsencrypt/boulder/issues/7229
2024-04-04 08:14:29 -07:00
Phil Porada 5f616ccdb9
Upgrade go-jose from v2.6.1 to v4.0.1 (#7345)
Upgrade from the old go-jose v2.6.1 to the newly minted go-jose v4.0.1. 
Cleans up old code now that `jose.ParseSigned` can take a list of
supported signature algorithms.

Fixes https://github.com/letsencrypt/boulder/issues/7390

---------

Co-authored-by: Aaron Gable <aaron@letsencrypt.org>
2024-04-02 17:49:51 -04:00
Phil Porada 8556eaedca
SA: store and return certificate profile name (#7352)
Adds `certificateProfileName` to the `orders` database table. The
[maximum
length](https://github.com/letsencrypt/boulder/pull/7325/files#diff-a64a0af7cbf484da8e6d08d3eefdeef9314c5d9888233f0adcecd21b800102acR35)
of a profile name matches the `//issuance` package.

Adds a `MultipleCertificateProfiles` feature flag that, when enabled,
will store the certificate profile name from a `NewOrderRequest`. The
certificate profile name is allowed to be empty and the database will
treat that row as [NULL](https://mariadb.com/kb/en/null-values/). When
the SA retrieves this potentially NULL row, it will be cast as the
golang string zero value `""`.

SRE ticket IN-10145 has been filed to perform the database migration and
enable the new feature flag. The migration must be performed before
enabling the feature flag.

Part of https://github.com/letsencrypt/boulder/issues/7324
2024-03-20 13:08:31 -04:00
Aaron Gable 7f04092e72
Simplify streaming rows from the database (#7372)
Create a new method on the gorm rows object which runs a small closure
for every row retrieved from the database. Use this new method to remove
20 lines of boilerplate from five different SA methods and rocsp-tool.
2024-03-19 08:39:00 -07:00
Samantha 5e68cbe552
WFE: Gate ARI limit exemption and replacement tracking on a feature flag (#7383)
Gate checking of replacement orders and exemption for ARI replacements
on the `TrackReplacementCertificatesARI` feature flag.
2024-03-18 12:22:01 -04:00
Aaron Gable 8d169a8dfb
Add certificateProfileName to RA, SA, and Core order protos (#7381)
This adds the profile name to the proto messages necessary to propagate
it from the WFE to the SA, and from the SA to the CA. This change is
safe to land prior to any logic being added, and unblocks
profile-handling logic changes to the WFE, RA, SA, and CA.

Part of https://github.com/letsencrypt/boulder/issues/7309
2024-03-14 13:46:58 -04:00
Aaron Gable 6710ebe4cd
admin: use SA to get serials by account and by SPKI hash (#7369)
Add two new methods to the SA, GetSerialsByKey and GetSerialsByAccount,
which use the same query as the admin tool has previously used to get
serials matching a given SPKI hash or a given registration ID. These two
new gRPC methods read the database row-by-row and produce streams of
results to keep SA memory usage low.

Use these methods in the admin tool so it no longer needs a direct
database connection for these actions.

Part of https://github.com/letsencrypt/boulder/issues/7350
2024-03-11 13:25:59 -07:00
Samantha f10abd27eb
SA/ARI: Add method of tracking certificate replacement (#7284)
Part of #6732
Part of #7038
2024-02-08 14:19:29 -05:00
Phil Porada 0e9f5d3545
va: Audit log which DNS resolver performs a lookup (#7271)
Adds the chosen DNS resolver to the VA's `ValidationRecord` object so
that for each challenge type during a validation, boulder can audit log
the resolver(s) chosen to fulfill the request.

Fixes https://github.com/letsencrypt/boulder/issues/7140
2024-02-05 14:26:39 -05:00
Aaron Gable c305acfd97
SA: Add GetLintPrecertificate gRPC method (#7274)
Add a new "GetLintPrecertificate" method to the SA's gRPC service. This
acts identically to the existing "GetCertificate", but returns the
linting precertificate created just prior to the actual precertificate
instead. This is useful for revocation, where we need to be able to act
on a serial even if the corresponding (pre)certificate was never issued
or never saved to the database.

Part of https://github.com/letsencrypt/boulder/issues/7135
2024-01-23 14:01:28 -08:00
Phil Porada 0fc9de63ee
SA: Enforce microsecond granularity for long_query_time and max_statement_time (#7224)
In MariaDB, `long_query_time`[1] and `max_statement_time`[2] have up to
microsecond granularity (6 digits to the right of the decimal).

Fixes an issue detected by proxysql in staging.
```
MySQL_Session.cpp:6567:handler___status_WAITING_CLIENT_DATA___STATE_SLEEP___MYSQL_COM_QUERY_qpo(): [ERROR] Unable to parse query. If correct, report it as a bug: SET long_query_time=3.9200000000000004
```

1. https://mariadb.com/kb/en/server-system-variables/#long_query_time
2. https://mariadb.com/kb/en/server-system-variables/#max_statement_time
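
One possible way to keep such values within microsecond precision is to format them with exactly six decimal places. This is a sketch, not necessarily the approach the PR took, and `formatSeconds` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatSeconds renders a number of seconds with exactly six digits after
// the decimal point, i.e. microsecond granularity.
func formatSeconds(seconds float64) string {
	return strconv.FormatFloat(seconds, 'f', 6, 64)
}

func main() {
	fmt.Println("SET long_query_time=" + formatSeconds(3.9200000000000004))
}
```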

---------

Co-authored-by: Aaron Gable <aaron@letsencrypt.org>
2023-12-21 13:06:30 -05:00
Aaron Gable 164e035915
Reduce logging from inflight validation collisions (#7209)
If a client attempts to validate a challenge twice in rapid succession,
we'll kick off two background validation routines. One of these will
complete first, updating the database with success or failure. The other
will fail when it attempts to update the database and finds that there
are no longer any authorizations with that ID in the "pending" state.
Reduce the level at which we log such events, since we don't
particularly care about them.

Fixes https://github.com/letsencrypt/boulder/issues/3995
2023-12-15 09:58:34 -08:00
Aaron Gable 21b18667b2
Remove static test certs from SA unittests (#7217)
Fixes https://github.com/letsencrypt/boulder/issues/6279
2023-12-15 07:36:59 -08:00
Phil Porada 51e9f39259
Finish migration from int64 durations to durationpb (#7147)
This is a cleanup PR finishing the migration from int64 durations to
protobuf `*durationpb.Duration` by removing all usage of the old int64
fields. In the previous PR
https://github.com/letsencrypt/boulder/pull/7146 all fields were
switched to read from the protobuf durationpb fields.

Fixes https://github.com/letsencrypt/boulder/issues/7097
2023-11-28 12:51:11 -05:00
Phil Porada 6925fad324
Finish migration from int64 timestamps to timestamppb (#7142)
This is a cleanup PR finishing the migration from int64 timestamps to
protobuf `*timestamppb.Timestamp` by removing all usage of the old
int64 fields. In the previous PR
https://github.com/letsencrypt/boulder/pull/7121 all fields were
switched to read from the protobuf timestamppb fields.

Adds a new case to `core.IsAnyNilOrZero` to check various properties of
a `*timestamppb.Timestamp` reducing the visual complexity for receivers.

Fixes https://github.com/letsencrypt/boulder/issues/7060
2023-11-27 13:37:31 -08:00
Phil Porada 279a4d539d
Read from durationpb instead of int64 durations (#7146)
Switch to reading gRPC duration values from the new durationpb protobuf
fields, completely ignoring the old int64 fields.

Part 2 of 3 for https://github.com/letsencrypt/boulder/issues/7097
2023-11-13 12:23:46 -05:00
Aaron Gable f24ec910ef
Further simplifications to test.ThrowAwayCert (#7129)
Remove ThrowAwayCert's nameCount argument, since it is always set to 1
by all callers. Remove ThrowAwayCertWithSerial, because it has no
callers. Change the throwaway cert's key from RSA512 to ECDSA P-224 for
a two-orders-of-magnitude speedup in key generation. Use this simplified
form in two new places in the RA that were previously rolling their own
test certs.
2023-11-02 09:45:56 -07:00
Aaron Gable 3a3e32514c
Give throwaway test certs reasonable validity intervals (#7128)
Add a new clock argument to the test-only ThrowAwayCert function, and
use that clock to generate reasonable notBefore and notAfter timestamps
in the resulting throwaway test cert. This is necessary to easily test
functions which rely on the expiration timestamp of the certificate,
such as upcoming work about computing CRL shards.

Part of https://github.com/letsencrypt/boulder/issues/7094
2023-11-01 15:24:43 -07:00
Phil Porada b8b105453a
Rename protobuf duration fields to <fieldname>NS and populate new duration fields (#7115)
* Renames all int64 fields that hold a time.Duration to `<fieldname>NS` to
indicate they are nanosecond counts.
* Adds new `google.protobuf.Duration` fields to each .proto file where
we previously had been using an int64 field to populate a time.Duration.
* Updates relevant gRPC messages.

Part 1 of 3 for https://github.com/letsencrypt/boulder/issues/7097
2023-10-26 10:46:03 -04:00
Phil Porada a5c2772004
Add and populate new protobuf Timestamp fields (#7070)
* Adds new `google.protobuf.Timestamp` fields to each .proto file where
we had been using `int64` fields as a timestamp.
* Updates relevant gRPC messages to populate the new
`google.protobuf.Timestamp` fields in addition to the old `int64`
timestamp fields.
* Added tests for each `<x>ToPB` and `PBTo<x>` function to ensure that
new fields passed into a gRPC message arrive as intended.
* Removed an unused error return from `PBToCert` and `PBToCertStatus`
and cleaned up each call site.

Built on top of https://github.com/letsencrypt/boulder/pull/7069
Part 2 of 4 related to
https://github.com/letsencrypt/boulder/issues/7060
2023-10-11 12:12:12 -04:00
Aaron Gable bab048d221
SA: Add and use revokedCertificates table (#7095)
Add a new "revokedCertificates" table to the database schema. This table
is similar to the existing "certificateStatus" table in many ways, but
the idea is that it will only have rows added to it when certificates
are revoked, not when they're issued. Thus, it will grow many orders of
magnitude slower than the certificateStatus table does. Eventually, it
will replace that table entirely.

The one column that revokedCertificates adds is the new "ShardIdx"
column, which is the CRL shard in which the revoked certificate will
appear. This way we can assign certificates to CRL shards at the time
they are revoked, and guarantee that they will never move to a different
shard even if we change the number of shards we produce. This will
eventually allow us to put CRL URLs directly into our certificates,
replacing OCSP URLs.

Add new logic to the SA's RevokeCertificate and UpdateRevokedCertificate
methods to handle this new table. If these methods receive a request
which specifies a CRL shard (our CRL shards are 1-indexed, so shard 0
does not exist), then they will ensure that the new revocation status is
written into both the certificateStatus and revokedCertificates tables.
This logic will not function until the RA is updated to take advantage
of it, so it is not a risk for it to appear in Boulder before the new
table has been created.

Also add new logic to the SA's GetRevokedCertificates method. Similar to
the above, this reads from the new table if the ShardIdx field is
supplied in the request message. This code will not operate until the
crl-updater is updated to include this field. We will not perform this
update for a minimum of 100 days after this code is deployed, to ensure
that all unexpired revoked certificates are present in the
revokedCertificates table.

Part of https://github.com/letsencrypt/boulder/issues/7094
2023-10-02 10:21:14 -07:00
Phil Porada 034316ef6a
Rename int64 timestamp related protobuf fields to <fieldname>NS (#7069)
Rename all int64 timestamp fields to `<fieldname>NS` to indicate they
are Unix nanosecond timestamps.

Part 1 of 4 related to
https://github.com/letsencrypt/boulder/issues/7060
2023-09-15 13:49:07 -04:00
Aaron Gable a70fc604a3
Use go1.21's stdlib slices package (#7074)
As of go1.21, there's a new standard library package which provides
basically the same (generic!) functions that the x/exp/slices package has
been providing. Now that we're on go1.21, let's use the more stable package.

Fixes https://github.com/letsencrypt/boulder/issues/6951
Fixes https://github.com/letsencrypt/boulder/issues/7032
2023-09-08 13:46:46 -07:00
Aaron Gable 7bed24a401
SA: Fix two bugs in UpdateCRLShard (#7052)
The NextUpdate field should not be required, as it is not necessary for
tracking and preventing duplicate work between multiple crl-updater
instances.

The ThisUpdate conditional needs explicit handling for NULL to ensure
that it updates correctly.
2023-08-31 12:06:33 -04:00
Aaron Gable 6a450a2272
Improve CRL shard leasing (#7030)
Simplify the index-picking logic in the SA's leaseOldestCrlShard method.
Specifically, more clearly separate it into "missing" and "non-missing"
cases, which require entirely different logic: picking a random missing
shard, or picking the oldest unleased shard, respectively.

Also change the UpdateCRLShard method to "unlease" shards when they're
updated. This allows the crl-updater to run as quickly as it likes,
while still ensuring that multiple instances do not step on each other's
toes.

The config change for shardWidth and lookbackPeriod instead of
certificateLifetime has been deployed in prod since IN-8445. The config
change changing the shardWidth is just so that the tests neither produce
a bazillion shards, nor have to do a bazillion SA queries for each chunk
within a shard, improving the readability of test logs.

Part of https://github.com/letsencrypt/boulder/issues/7023
2023-08-08 17:05:00 -07:00
Jacob Hoffman-Andrews 38fc840184
sa: refactor how metrics and logging are set up (#7031)
This eliminates the need for a pair of accessors on `db.WrappedMap` that
expose the underlying `*sql.DB` and `*borp.DbMap`.

Fixes #6991
2023-08-08 09:51:23 -07:00
Aaron Gable 9a4f0ca678
Deprecate LeaseCRLShards feature (#7009)
This feature flag is enabled in both staging and prod.
2023-08-07 15:17:00 -07:00
Jacob Hoffman-Andrews 725f190c01
ca: remove orphan queue code (#7025)
The `orphanQueueDir` config field is no longer used anywhere.

Fixes #6551
2023-08-02 16:04:28 -07:00
Samantha 055f620c4b
Initial implementation of key-value rate limits (#6947)
This design seeks to reduce read-pressure on our DB by moving rate limit
tabulation to a key-value datastore. This PR provides the following:

- (README.md) a short guide to the schemas, formats, and concepts
introduced in this PR
- (source.go) an interface for storing, retrieving, and resetting a
subscriber bucket
- (name.go) an enumeration of all defined rate limits
- (limit.go) a schema for defining default limits and per-subscriber
overrides
- (limiter.go) a high-level API for interacting with key-value rate
limits
- (gcra.go) an implementation of the Generic Cell Rate Algorithm, a
leaky bucket-style scheduling algorithm, used to calculate the present
or future capacity of a subscriber bucket using spend and refund
operations

Note: the included source implementation is test-only and currently
accomplished using a simple in-memory map protected by a mutex;
implementations using Redis and potentially other data stores will
follow.

Part of #5545
2023-07-21 12:57:18 -04:00
Aaron Gable 908421bb98
crl-updater: lease CRL shards to prevent races (#6941)
Add a new feature flag, LeaseCRLShards, which controls certain aspects
of crl-updater's behavior.

When this flag is enabled, crl-updater calls the new SA.LeaseCRLShard
method before beginning work on a shard. This prevents it from stepping
on the toes of another crl-updater instance which may be working on the
same shard. This is important to prevent two competing instances from
accidentally updating a CRL's Number (which is an integer representation
of its thisUpdate timestamp) *backwards*, which would be a compliance
violation.

When this flag is enabled, crl-updater also calls the new
SA.UpdateCRLShard method after finishing work on a shard.

In the future, additional work will be done to make crl-updater use the
"give me the oldest available shard" mode of the LeaseCRLShard method.

Fixes https://github.com/letsencrypt/boulder/issues/6897
2023-07-19 15:11:16 -07:00
Jacob Hoffman-Andrews 7d66d67054
It's borpin' time! (#6982)
This change replaces [gorp] with [borp].

The changes consist of a mass renaming of the import and comments / doc
fixups, plus modifications of many call sites to provide a
context.Context everywhere, since borp newly requires this (this was one
of the motivating factors for the borp fork).

This also refactors `github.com/letsencrypt/boulder/db.WrappedMap` and
`github.com/letsencrypt/boulder/db.Transaction` to not embed their
underlying gorp/borp objects, but to have them as plain fields. This
ensures that we can only call methods on them that are specifically
implemented in `github.com/letsencrypt/boulder/db`, so we don't miss
wrapping any. This required introducing a `NewWrappedMap` method along
with accessors `SQLDb()` and `BorpDB()` to get at the internal fields
during metrics and logging setup.

Fixes #6944
2023-07-17 14:38:29 -07:00
Aaron Gable bd29cc430f
Allow reading incident rows with NULL columns (#6961)
Fixes https://github.com/letsencrypt/boulder/issues/6960
2023-06-30 08:29:16 -07:00
Aaron Gable 3d80d8505e
SA: gRPC methods for leasing CRL shards (#6940)
Add two new methods, LeaseCRLShard and UpdateCRLShard, to the SA gRPC
interface. These methods work in concert both to prevent multiple
instances of crl-updater from stepping on each other's toes, and to lay
the groundwork for a less bursty version of crl-updater in the future.

Introduce a new database table, crlShards, which tracks the thisUpdate
and nextUpdate timestamps of each CRL shard for each issuer. It also has
a column "leasedUntil", which is also a timestamp. Grant the SA user
read-write access to this table.

LeaseCRLShard updates the leasedUntil column of the identified shard to
the given time. It returns an error if the identified shard's
leasedUntil timestamp is already in the future. This provides a
mechanism for crl-updater instances to "lick the cookie", so to speak,
marking CRL shards as "taken" so that multiple crl-updater instances
don't attempt to work on the same shard at the same time. Using a
timestamp has the added benefit that leases are guaranteed to expire,
ensuring that we don't accidentally fail to work on a shard forever.

LeaseCRLShard has a second mode of operation, when a range of potential
shards is given in the request, rather than a single shard. In this
mode, it returns the shard (within the given range) whose thisUpdate
timestamp is oldest. (Shards with no thisUpdate timestamp, including
because the requested range includes shard indices the database doesn't
yet know about, count as older than any shard with any thisUpdate
timestamp.) This allows crl-updater instances which don't care which
shard they're working on to do the most urgent work first.

UpdateCRLShard updates the thisUpdate and nextUpdate timestamps of the
identified shard. This closes the loop with the second mode of
LeaseCRLShard above: by updating the thisUpdate timestamp, the method
marks the shard as no longer urgently needing to be worked on.

IN-9220 tracks creating this table in staging and production.
Part of #6897
2023-06-26 15:39:13 -07:00
Jacob Hoffman-Andrews 824417f6c0
sa: refactor db initialization (#6930)
Previously, we had three chained calls initializing a database:

 - InitWrappedDb calls NewDbMap
 - NewDbMap calls NewDbMapFromConfig

Since all three are exported, this left me wondering when to call one vs
the others.

It turns out that NewDbMap is only called from tests, so I renamed it to
DBMapForTest to make that clear.

NewDbMapFromConfig is only called internally to the SA, so I
unexported it as newDbMapFromMysqlConfig.

Also, I copied the ParseDSN call into InitWrappedDb, so it doesn't need
to call DBMapForTest. Now InitWrappedDb and DBMapForTest both
independently call newDbMapFromMysqlConfig.

I also noticed that InitDBMetrics was only called internally so I
unexported it.
2023-06-13 10:15:40 -07:00
Samantha 124c4cc6f5
grpc/sa: Implement deep health checks (#6928)
Add the necessary scaffolding for deep health checking of our various
gRPC components. Each component implementation that also implements the
grpc.checker interface will be checked periodically, and the health
status of the component will be updated accordingly.

Add the necessary methods to SA to implement the grpc.checker interface
and register these new health checks with Consul.

Additionally:
- Update entry point script to check for ProxySQL readiness.
- Increase the poll rate for gRPC Consul checks from 5s to 2s to help
with DNS failures, due to check failures, on startup.
- Change log level for Consul from INFO to ERROR to deal with noisy logs
full of transport failures due to Consul gRPC checks firing before the
SAs are up.

Fixes #6878
Part of #6795
2023-06-12 13:58:53 -04:00
Jacob Hoffman-Andrews 80e1510819
admin: add clear-email subcommand (#6919)
When a user wants their email address deleted from the database but no
longer has access to their account, this allows an administrator to
clear it.

This adds `admin` as an alias for `admin-revoker`, because we'd like the
clear-email sub-command to be a part of that overall tool, but it's not
really revocation related.

Part of #6864
2023-06-01 14:33:24 -04:00
Samantha e72a8f9cac
docker: Update proxysql container to match production (#6914)
2023-05-31 11:31:10 -04:00
Jacob Hoffman-Andrews b9eeb6ce1c
sa/database: move unmoored comment (#6922)
This comment about STRICT_ALL_TABLES got separated from the code it
documented. Bring them back together.
2023-05-30 09:15:06 -07:00
Phil Porada c75bf7033a
SA: Don't store HTTP-01 hostname and port in database validationrecord (#6863)
Removes the `Hostname` and `Port` fields from an http-01
ValidationRecord model prior to storing the record in the database.
Using `"hostname":"example.com","port":"80"` as a snippet of a whole
validation record, we'll save a minimum of 36 bytes for each new http-01
ValidationRecord that gets stored. When retrieving the record, the
ValidationRecord `RehydrateHostPort` method will repopulate the
`Hostname` and `Port` fields from the `URL` field.
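
The rehydration step can be sketched roughly as follows. This is a minimal illustration, not the actual `RehydrateHostPort` source: the `validationRecord` type is a trimmed-down hypothetical stand-in, and falling back to port 80 for `http` and 443 for `https` when the URL has no explicit port is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validationRecord is a hypothetical, trimmed-down stand-in for the real model.
type validationRecord struct {
	URL      string
	Hostname string
	Port     string
}

// rehydrateHostPort fills in Hostname and Port from URL when they are empty.
func (vr *validationRecord) rehydrateHostPort() error {
	if vr.URL == "" {
		return errors.New("rehydrating validation record: URL is empty")
	}
	u, err := url.Parse(vr.URL)
	if err != nil {
		return fmt.Errorf("parsing validation record URL: %w", err)
	}
	if vr.Hostname == "" {
		vr.Hostname = u.Hostname()
	}
	if vr.Port == "" {
		vr.Port = u.Port()
		if vr.Port == "" {
			// Assumed defaults when the URL carries no explicit port.
			switch u.Scheme {
			case "http":
				vr.Port = "80"
			case "https":
				vr.Port = "443"
			default:
				return fmt.Errorf("unknown scheme %q in validation record URL", u.Scheme)
			}
		}
	}
	return nil
}

func main() {
	vr := validationRecord{URL: "http://example.com/.well-known/acme-challenge/token"}
	if err := vr.rehydrateHostPort(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(vr.Hostname, vr.Port)
}
```

Because the stored record always keeps its `URL`, dropping the two derived fields loses no information as long as they can be recomputed this way.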

Fixes the main goal of
https://github.com/letsencrypt/boulder/issues/5231.

---------

Co-authored-by: Samantha <hello@entropy.cat>
2023-05-23 15:36:17 -04:00
Aaron Gable 56f8537e68
Ensure SelectOne queries never return more than 1 row (#6900)
As a follow-up to https://github.com/letsencrypt/boulder/issues/5467, I
did an audit of all places where we call SelectOne to ensure that those
queries can never return more than one result. These four functions were
the only places that weren't already constrained to a single result
through the use of "SELECT COUNT", "LIMIT 1", "WHERE uniqueKey =", or
similar. Limit these functions' queries to always only return a single
result, now that their underlying tables no longer have unique key
constraints.

Additionally, slightly refactor selectRegistration to just take a single
column name rather than a whole WHERE clause.

Fixes https://github.com/letsencrypt/boulder/issues/6521
2023-05-17 14:13:21 -07:00
Matthew McPherrin 8c9c55609b
Remove redundant jose import alias (#6887)
This PR should have no functional change; just a cleanup.
2023-05-15 09:45:58 -07:00
Aaron Gable 1fcd951622
Probs: simplifications and cleanup (#6876)
Make minor, non-user-visible changes to how we structure the probs
package. Notably:
- Add new problem types for UnsupportedContact and
UnsupportedIdentifier, which are specified by RFC8555 and which we will
use in the future, but haven't been using historically.
- Sort the problem types and constructor functions to match the
(alphabetical) order given in RFC8555.
- Rename some of the constructor functions to better match their
underlying problem types (e.g. "TLSError" to just "TLS").
- Replace the redundant ProblemDetailsToStatusCode function with simply
always returning a 500 if we haven't properly set the problem's
HTTPStatus.
- Remove the ability to use either the V1 or V2 error namespace prefix;
always use the proper RFC namespace prefix.
2023-05-12 12:10:13 -04:00
Jacob Hoffman-Andrews 1c7e0fd1d8
Store linting certificate instead of precertificate (#6807)
In order to get rid of the orphan queue, we want to make sure that
before we sign a precertificate, we have enough data in the database
that we can fulfill our revocation-checking obligations even if storing
that precertificate in the database fails. That means:

- We should have a row in the certificateStatus table for the serial.
- But we should not serve "good" for that serial until we are positive
the precertificate was issued (BRs 4.9.10).
- We should have a record in the live DB of the proposed certificate's
public key, so the bad-key-revoker can mark it revoked.
- We should have a record in the live DB of the proposed certificate's
names, so it can be revoked if we are required to revoke based on names.

The SA.AddPrecertificate method already achieves these goals for
precertificates by writing to the various metadata tables. This PR
repurposes the SA.AddPrecertificate method to write "proposed
precertificates" instead.

We already create a linting certificate before the precertificate, and
that linting certificate is identical to the precertificate that will be
issued except for the private key used to sign it (and the AKID). So for
instance it contains the right pubkey and SANs, and the Issuer name is
the same as the Issuer name that will be used. So we'll use the linting
certificate as the "proposed precertificate" and store it to the DB,
along with appropriate metadata.

In the new code path, rather than writing "good" for the new
certificateStatus row, we write a new, fake OCSP status string "wait".
This will cause us to return internalServerError to OCSP requests for
that serial (but we won't get such requests because the serial has not
yet been published). After we finish precertificate issuance, we update
the status to "good" with SA.SetCertificateStatusReady.

Part of #6665
2023-04-26 13:54:24 -07:00
Aaron Gable 97aa50977f
Give orderToAuthz2 an auto-increment ID column (#6835)
Replace the current orderToAuthz2 table schema with one that includes an
auto-increment ID column, so that this table can be partitioned simply
by ID, like all of our other partitioned tables.

Update the SA so that when it selects from a join over this table and
the authz2 table, it explicitly selects the columns from the authz2
table, to avoid the ambiguity introduced by having two columns named
"id" in the result set.

This work is already in-progress in prod, represented by IN-8916 and
IN-8928.

Fixes https://github.com/letsencrypt/boulder/issues/6820
2023-04-24 14:59:18 -07:00
Aaron Gable 5480f1060b
Clean up database schema (#6832)
Make a series of small changes to our test database schema, both to make
it simpler to reason about and to bring it closer in alignment to our
production database schema:
- Incorporate the IssuedNamesDropIndex, Incidents, SimplePartitioning,
and NotUnique migrations into the CombinedSchema, as they have been
fully applied in prod;
- Use CHARSET=utf8mb4 everywhere, instead of just utf8;
- Use UNSIGNED for auto-increment ID columns in the tables where prod
does; and
- Re-sort the tables in CombinedSchema which no longer have foreign key
constraints.

Part of https://github.com/letsencrypt/boulder/issues/6820
2023-04-21 10:37:05 -07:00
Phil Porada 939a14544c
SA: Check MariaDB system variables at startup (#6791)
Adds a new function to `//sa` that performs the following validations
on a MariaDB config passed in via SA `setDefault` or via DSN:
1. Correct quoting for strings and string enums to prevent future
problems such as PR #6683 from occurring.
2. Each system variable we care to use is scoped as SESSION, rather than
strictly GLOBAL.
3. Detect system variables passed in that are not in a curated list of
variables we care about.
4. Validate that values for booleans, floats, integers, and strings at
least pass a basic regex.
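
A rough sketch of checks 1, 3, and 4 is shown below. Everything in it is an assumption made for illustration: the function name `checkSystemVariables`, the three-entry curated list, and the regexes are not taken from Boulder, and the SESSION-versus-GLOBAL scope check (item 2) is not modelled at all.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// String values must be wrapped in single quotes.
	quotedString = regexp.MustCompile(`^'[0-9a-zA-Z_,]+'$`)
	// Integers and simple decimals.
	number = regexp.MustCompile(`^\d+(\.\d+)?$`)
)

// allowed is a hypothetical curated list mapping each permitted system
// variable to the pattern its value must match.
var allowed = map[string]*regexp.Regexp{
	"sql_mode":           quotedString,
	"max_statement_time": number,
	"long_query_time":    number,
}

// checkSystemVariables rejects variables outside the curated list and values
// that fail the basic pattern for their type.
func checkSystemVariables(vars map[string]string) error {
	for name, value := range vars {
		pattern, ok := allowed[name]
		if !ok {
			return fmt.Errorf("%q is not in the curated list of system variables", name)
		}
		if !pattern.MatchString(value) {
			return fmt.Errorf("value %q for %q failed validation", value, name)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkSystemVariables(map[string]string{"sql_mode": "'STRICT_ALL_TABLES'"}))
	fmt.Println(checkSystemVariables(map[string]string{"sql_mode": "STRICT_ALL_TABLES"}))
}
```

Failing fast at startup on an unquoted `sql_mode` is the point of the exercise: it turns a silent mis-parse further down the connection path into an immediate configuration error.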

This change is in a bit of a weird place. The ideal place for this
change would be `go-sql-driver/mysql`, but since that driver handles the
general case of MySQL-compatible connections to the database, we're
implementing this validation in Boulder instead. We're confident about
the specific versions of MariaDB running in staging/prod and that the
database vendor won't change underneath us, which is why I decided to
take this approach. However, this change will bind us tighter to MariaDB
than MySQL due to the specific variables we're checking. An up-to-date
list of MariaDB system variables can be found
[here.](https://mariadb.com/kb/en/server-system-variables/)

Fixes https://github.com/letsencrypt/boulder/issues/6687.
2023-04-18 11:02:33 -04:00
Aaron Gable 1235cbed5e
Re-remove never-used crls table (#6817)
Relands #5303, which was accidentally reverted in #5305.

Fixes https://github.com/letsencrypt/boulder/issues/6816
2023-04-17 16:00:17 -07:00
Aaron Gable 45329c9472
Deprecate ROCSPStage7 flag (#6804)
Deprecate the ROCSPStage7 feature flag, which caused the RA and CA to
stop generating OCSP responses when issuing new certs and when revoking
certs. (That functionality is now handled just-in-time by the
ocsp-responder.) Delete the old OCSP-generating codepaths from the RA
and CA. Remove the CA's internal reference to an OCSP implementation,
because it no longer needs it.

Additionally, remove the SA's "Issuers" config field, which was never
used.

Fixes #6285
2023-04-12 17:03:06 -07:00
Aaron Gable 7e994a1216
Deprecate ROCSPStage6 feature flag (#6770)
Deprecate the ROCSPStage6 feature flag. Remove all references to the
`ocspResponse` column from the SA, both when reading from and when
writing to the `certificateStatus` table. This makes it safe to fully
remove that column from the database.

IN-8731 enabled this flag in all environments, so it is safe to
deprecate.

Part of #6285
2023-04-04 15:41:51 -07:00
Aaron Gable 8c67769be4
Remove ocsp-updater from Boulder (#6769)
Delete the ocsp-updater service, and the //ocsp/updater library that
supports it. Remove test configs for the service, and remove references
to the service from other test files.

This service has been fully shut down for an extended period now, and is
safe to remove.

Fixes #6499
2023-03-31 14:39:04 -07:00
Aaron Gable 9262ca6e3f
Add grpc implementation tests to all services (#6782)
As a follow-up to #6780, add the same style of implementation test to
all of our other gRPC services. This was not included in that PR just to
keep it small and single-purpose.
2023-03-31 09:52:26 -07:00
Aaron Gable 27f0860aed
Remove precertificates.go (#6783)
This file contained both read-only and read-write methods. Its existence
is not reflected in any other gRPC or struct organization; it was easy
to forget that it exists. Merge its contents into both sa.go and
saro.go, so that the methods follow the same organization scheme as the
rest of the SA.

This makes it less likely that bugs like #6778 will happen again.
2023-03-30 17:59:11 -04:00
Aaron Gable 0d0116dd3f
Implement GetSerialMetadata on StorageAuthorityRO (#6780)
When external clients make POST requests to our ARI endpoint, they're
getting 404s even when a GET request with the same exact CertID
succeeds. Logs show that this is because the SA is returning "method
GetSerialMetadata not implemented" when the WFE attempts that gRPC
request. This is due to an oversight: the GetSerialMetadata method is
not implemented on the SQLStorageAuthorityRO object, only on the
SQLStorageAuthority object. The unit tests did not catch this bug
because they supply a mock SA, which does implement the method in
question.

Update the receiver and add a wrapper so that GetSerialMetadata is
implemented on both the read-write and read-only SA implementation
types. Add a new kind of test assertion which helps ensure this won't
happen again. Add a TODO for an integration test covering the ARI POST
codepath to prevent a regression.

Fixes #6778
2023-03-30 12:32:14 -07:00
Phil Porada ce2ee69c5f
SARO: Add sa_lag_factor metric to assess usage of the lagFactor codepath (#6774)
Add `sa_lag_retry` Prometheus CounterVec metric with `method` and
`result` labels for the `GetOrder`, `GetAuthorization2`, and
`GetRegistration` methods.

The new metrics will appear as follows:
```
sa_lag_retry{method="GetOrder",result="found"} 0
sa_lag_retry{method="GetOrder",result="notfound"} 0
sa_lag_retry{method="GetOrder",result="other"} 0
sa_lag_retry{method="GetAuthorization2",result="found"} 0
sa_lag_retry{method="GetAuthorization2",result="notfound"} 0
sa_lag_retry{method="GetAuthorization2",result="other"} 0
sa_lag_retry{method="GetRegistration",result="found"} 0
sa_lag_retry{method="GetRegistration",result="notfound"} 0
sa_lag_retry{method="GetRegistration",result="other"} 0
```

Fixes https://github.com/letsencrypt/boulder/issues/6773

---------

Co-authored-by: Samantha <hello@entropy.cat>
2023-03-30 13:48:16 -04:00
Samantha 511f5b79f1
test: Add ProxySQL to our Docker development stack (#6754)
Add an upstream ProxySQL container to our docker-compose. Configure
ProxySQL to manage database connections for our unit and integration
tests.

Fixes #5873
2023-03-29 18:41:24 -04:00
Jacob Hoffman-Andrews 85fd3ed8b7
sa: remove GetPrecertificate (#6692)
This was mostly unused. The only caller was orphan-finder, which used it
to determine if a certificate was already in the database. But this is
not particularly important functionality, so I've removed it.
2023-03-01 11:30:51 -08:00
Jacob Hoffman-Andrews d9872dbe41
sa: rename AddPrecertificateRequest.IssuerID (#6689)
Rename AddPrecertificateRequest.IssuerID to IssuerNameID. This is in
preparation for adding a similarly-named field to AddSerialRequest.

Part of #5152.
2023-02-27 17:21:00 -05:00
Aaron Gable 5ce4b5a6d4
Use time format constants (#6694)
Use constants from the go stdlib time package, such as time.DateTime and
time.RFC3339, when parsing and formatting timestamps. Additionally,
simplify or remove some of our uses of parsing timestamps, such as to
set fake clocks in tests.
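
As a small illustration of the pattern (the helper name `toRFC3339` is made up for this sketch, and `time.DateTime` requires Go 1.20 or later):

```go
package main

import (
	"fmt"
	"time"
)

// toRFC3339 parses a "2006-01-02 15:04:05" timestamp using the stdlib
// time.DateTime layout and re-renders it using time.RFC3339. Input without
// zone information is interpreted as UTC by time.Parse.
func toRFC3339(s string) (string, error) {
	t, err := time.Parse(time.DateTime, s)
	if err != nil {
		return "", err
	}
	return t.Format(time.RFC3339), nil
}

func main() {
	out, err := toRFC3339("2023-02-24 11:22:23")
	fmt.Println(out, err)
}
```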
2023-02-24 11:22:23 -08:00
Jacob Hoffman-Andrews 8fd5861c1f
sa: quote sql_mode (#6683)
When sql_mode is set as part of a multi-variable SET command (which
happens in go-sql-driver/mysql 1.6.0+), ProxySQL can mis-parse parts of
the SET command that come after it. For instance, if we run:

SET sql_mode=STRICT_ALL_TABLES,log_queries_not_using_indexes=ON;

Then ProxySQL would mis-parse that and pass along to its upstream:

SET sql_mode=STRICT_ALL_TABLES,log_queries_not_using_indexes;

Adding quotes around sql_mode (a string-valued variable) causes
ProxySQL to parse this correctly.
2023-02-22 16:30:04 -05:00
Aaron Gable f9e4fb6c06
Add replication lag retries to some SA methods (#6649)
Add a new time.Duration field, LagFactor, to both the SA's config struct
and the read-only SA's implementation struct. In the GetRegistration,
GetOrder, and GetAuthorization2 methods, if the database select returns
a NoRows error and a lagFactor duration is configured, then sleep for
the lagFactor duration and retry the select.

This allows us to compensate for the replication lag between our primary
write database and our read-only replica databases. Sometimes clients
will fire requests in rapid succession (such as creating a new order,
then immediately querying the authorizations associated with that
order), and the subsequent requests will fail because they are directed
to read replicas which are lagging behind the primary. Adding this
simple sleep-and-retry will let us mitigate many of these failures,
without adding too much complexity.

Fixes #6593
2023-02-14 17:25:13 -08:00
Jacob Hoffman-Andrews e57c788086
Add checking of validations to cert-checker (#6617)
This includes two feature flags: one that controls turning on the extra
database queries, and one that causes cert-checker to fail on missing
validations. If the second flag isn't turned on, it will just emit error
log lines. This will help us find any edge conditions we need to deal
with before making the new code trigger alerts.

Fixes #6562
2023-02-03 16:25:41 -05:00
Phil Porada c0e158ed93
Limit input fields during new authz creation in sa.NewOrderAndAuthz (#6622)
A `core.Authorization` object has lots of fields (e.g. `status`, 
`attempted`, `attemptedAt`) which are not relevant to a 
newly-created authorization: a brand new authz can only be in 
the "pending" state, cannot have been attempted already or have 
been validated.

Fix a nil pointer dereference in `sa.NewOrderAndAuthzs` if a 
`req *sapb.NewOrderAndAuthzsRequest` is passed into the 
function with an inner nil `req.NewOrder`.

Add new tests:
- TestNewOrderAndAuthzs_MissingInnerOrder
  - Checks that the nil pointer dereference no longer occurs.
- TestNewOrderAndAuthzs_NewAuthzExpectedFields
  - Checks that the `Attempted`, `AttemptedAt`, `ValidationRecords`,
    and `ValidationErrors` fields for a brand new authz in the
    `pending` state are correctly defaulted to `nil` in
    `sa.NewOrderAndAuthzs`.

Add a new test assertion `AssertBoxedNil` that returns true for the
existence of a "boxed nil" - a nil value wrapped in a non-nil interface
type.
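
For readers unfamiliar with the term, the sketch below shows what a boxed nil looks like and one way to detect it with reflection. The helper `isBoxedNil` and the `myErr` type are hypothetical and are not the `AssertBoxedNil` implementation.

```go
package main

import (
	"fmt"
	"reflect"
)

// isBoxedNil reports whether v is a non-nil interface value that wraps a nil
// pointer, map, slice, channel, or function.
func isBoxedNil(v any) bool {
	if v == nil {
		// A plain nil interface is not boxed.
		return false
	}
	rv := reflect.ValueOf(v)
	switch rv.Kind() {
	case reflect.Chan, reflect.Func, reflect.Map, reflect.Pointer, reflect.Slice, reflect.UnsafePointer:
		return rv.IsNil()
	default:
		return false
	}
}

// myErr is a hypothetical error type with a pointer receiver.
type myErr struct{}

func (*myErr) Error() string { return "boom" }

func main() {
	var p *myErr      // nil pointer
	var err error = p // non-nil interface holding that nil pointer
	fmt.Println(err == nil, isBoxedNil(err))
}
```

The comparison `err == nil` is false for such a value, which is why an ordinary nil assertion cannot be used for these fields.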

Fixes #6535

---------

Co-authored-by: Samantha <hello@entropy.cat>
2023-02-03 15:38:51 -05:00
Jacob Hoffman-Andrews 074ecf3bd4
Improve MultiInserter (#6572)
Add validation of input parameters as unquoted MariaDB identifiers, and
document the regex that does it.
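
A minimal sketch of such a check follows. The pattern covers only the ASCII subset that MariaDB allows in unquoted identifiers (digits, basic Latin letters, dollar sign, underscore); whether it matches the regex this change documents is an assumption, and `validateIdentifiers` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
)

// validIdentifier matches the ASCII subset of unquoted MariaDB identifiers.
var validIdentifier = regexp.MustCompile(`^[0-9a-zA-Z$_]+$`)

// validateIdentifiers checks a table name and a list of column names before
// they are interpolated into a query string.
func validateIdentifiers(table string, fields []string) error {
	if !validIdentifier.MatchString(table) {
		return fmt.Errorf("invalid table name %q", table)
	}
	for _, f := range fields {
		if !validIdentifier.MatchString(f) {
			return fmt.Errorf("invalid field name %q", f)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateIdentifiers("issuedNames", []string{"reversedName", "serial"}))
	fmt.Println(validateIdentifiers("issuedNames; DROP TABLE x", []string{"serial"}))
}
```

Validating each field separately is only possible when the fields arrive as a list, which is one reason to prefer a list over a single comma-separated string.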

Accept a narrower interface (Queryer) for `Insert()`.

Take a list of fields rather than a string containing multiple fields,
to make validation simpler. Rename retCol to returningColumn.

Document safety properties and requirements.
2023-01-30 09:45:45 -08:00
Aaron Gable d9cb35c60c
Remove unused DBConnect config string (#6615)
Neither our testing, staging, nor production configs use the
DBConfig.DBConnect config value. Remove it.

To connect to a database, you have to provide a connection URL. These
URLs often contain sensitive information such as DB usernames and
passwords, so we don't store them directly in our configs -- instead, we
store paths to files which contain these strings, and provision those
files via a separate mechanism. We maintained the ability to provide a
URL directly in the config for the sake of easy testing, but have not
used it for that purpose for some time now.
2023-01-27 13:10:52 -08:00
Phil Porada 26e5b24585
dependencies: Replace square/go-jose.v2 with go-jose/go-jose.v2 (#6598)
Fixes #6573
2023-01-24 12:08:30 -05:00
Jacob Hoffman-Andrews 994e9d3d0b
Add duplicate serial to bad-key-revoker unittest (#6543)
This tests the fix in #6531.
2023-01-18 11:28:24 -08:00
Jacob Hoffman-Andrews 4be76afcaf
Extract out `db.QuestionMarks` function (#6568)
We use this pattern in several places: there is a query that needs to
have a variable number of placeholders (question marks) in it, depending
on how many items we are inserting or querying for. For instance, when
issuing a precertificate we add that precertificate's names to the
"issuedNames" table. To make things more efficient, we do that in a
single query, whether there is one name on the certificate or a hundred.
That means interpolating into the query string a series of question
marks that matches the number of names.

We have a helper type MultiInserter that solves this problem for simple
inserts, but it does not solve the problem for selects or more complex
inserts, and we still have a number of places that generate their
sequence of question marks manually.

This change updates addIssuedNames to use MultiInserter. To enable that,
it also narrows the interface required by MultiInserter.Insert, so it's
easier to mock in tests.

This change adds the new function db.QuestionMarks, which generates e.g.
`?,?,?` depending on the input N.
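
A function with that behaviour could look like the sketch below. It is an illustrative stand-in, not the `db.QuestionMarks` source; returning an empty string for non-positive input is an assumption, and the table and column names in `main` are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// questionMarks returns n comma-separated "?" placeholders, for example
// "?,?,?" when n is 3. It returns "" when n is zero or negative.
func questionMarks(n int) string {
	if n <= 0 {
		return ""
	}
	return strings.Repeat("?,", n-1) + "?"
}

func main() {
	names := []string{"a.example.com", "b.example.com", "c.example.com"}
	// Hypothetical query: only placeholders are interpolated, never values.
	query := fmt.Sprintf(
		"SELECT id FROM exampleNames WHERE name IN (%s)",
		questionMarks(len(names)),
	)
	fmt.Println(query)
}
```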

In a few places I had to rename a function parameter named `db` to avoid
shadowing the `db` package.
2023-01-10 14:29:31 -08:00
Phil Porada cfa524a7a1
Deprecate StoreRevokerInfo flag (#6567)
Fixes #5238
2023-01-09 11:42:23 -08:00
Jacob Hoffman-Andrews 8d43397d1a
Update comment in sa/saro.go (#6564)
The comment references a method that no longer exists.
2023-01-05 14:14:23 -08:00
Jacob Hoffman-Andrews 4a348feb4e
db: remove unique indexes on some tables (#6519)
We use partitioning to be able to clean up old data, and partitioning is
incompatible with unique indexes. We still have a unique index on
`serial`, and these tables are downstream from there. There may still be
some duplicates, such as when a certificate is treated as orphaned but was
actually successfully added to the DB; when we later go to incorporate
it, a duplicate will show up.

This reflects changes already made in prod.

This PR removes a unittest that coincidentally relied on these indexes
to generate an error case it needed: `TestAddPrecertificateStatusFail`.
That test was added in #5918. We can bring that test back with a
significant refactoring to change `*db.WrappedMap` to an interface, but
in the meantime we're prioritizing landing this PR so we have a more
realistic integration test environment.
2022-12-05 16:24:05 -08:00
Aaron Gable f089aa5d5f
SA.GetOrder: conduct queries inside a transaction (#6541)
The SA's GetOrder method issues four separate database queries:
- get the Order object itself
- get the list of Names associated with it
- get the list of Authorization IDs associated with it
- get the contents of those Authorizations themselves

These four queries can hit different database replicas with different
amounts of replication lag, and therefore different views of the
universe. This can result in inconsistent results, such as the Order
existing, but not having any Authorization IDs associated with it.

Conduct these four queries all within a single transaction, to ensure
they all hit a single read-replica with a consistent view of the world.

Fixes #6540
2022-12-05 12:09:57 -08:00
Aaron Gable d8d5a030f4
SA: Remove NewOrder and NewAuthorizations2 (#6536)
Delete the NewOrder and NewAuthorizations2 methods from the SA's gRPC
interface. These methods have been replaced by the unified
NewOrderAndAuthzs method, which performs both sets of insertions in a
single transaction.

Also update the SA and RA unittests to not rely on these methods for
setting up test data that other functions-under-test rely on. In most
cases, replace calls to NewOrder with calls to NewOrderAndAuthzs. In the
SA tests specifically, replace calls to NewAuthorizations2 with a
streamlined helper function that simply does the single necessary
database insert.

Fixes #6510
Fixes #5816
2022-12-02 14:34:35 -08:00
Aaron Gable ba34ac6b6e
Use read-only SA clients in wfe, ocsp, and crl (#6484)
In the WFE, ocsp-responder, and crl-updater, switch from using
StorageAuthorityClients to StorageAuthorityReadOnlyClients. This ensures
that these services cannot call methods which write to our database.

Fixes #6454
2022-12-02 13:48:28 -08:00
Aaron Gable b7e4e9d0ce
SA: Remove AddCertificate's unused return value (#6532)
The `digest` value in AddCertificate's response message is never used by
any callers. Remove it, replacing the whole response message with
google.protobuf.Empty, to mirror the AddPrecertificate method.

This swap is safe, because message names are not sent on the network,
and empty message fields are omitted from the wire format entirely, so
sending the predefined Empty message is identical to sending an empty
AddCertificateResponse message. Since no client is inspecting the
response to access the digest field, sending an empty response will not
break any clients.

Fixes #6498
2022-11-30 13:00:56 -08:00
J.C. Jones 6805c39580
Fix typo for omitZero in database.go from PR #6492 (#6515)
We have a new method, omitZero, which allows `max_statement_time`
and `long_query_time` to be zeroed out, but a copy/paste error left
`long_query_time` out. Fix it.
2022-11-17 13:27:57 -08:00
Aaron Gable 992ae61439
SA.NewOrder: Get order status inside transaction (#6512)
The SA's NewOrder and NewOrderAndAuthzs methods both write new rows (the
order itself, new authorizations, and new OrderToAuthz relations) to the
database, and then quickly turn around and query the database for a
bunch of authz rows to compute the status of the new order. This is
necessary because many orders are created already referencing existing
authorizations, and the state of those authorizations is not known at
order creation time, but does affect the order's status.

Due to replication lag, it can cause issues if the database writes go to
the primary database, but the follow-up read goes to a read-only
database replica which may be lagging slightly. To prevent this issue,
conduct the reads on the same transaction as the writes.

Fixes #6511
2022-11-16 16:53:23 -08:00
Samantha 76b2ec0702
SA: Standardize methods which use COUNT queries (#6505)
- Replace `-1` in return values with `0`. No callers were depending on
`-1`.
- Replace `count(` with `COUNT(` for the sake of readability.
- Replace `COUNT(1)` with `COUNT(*)` (https://mariadb.com/kb/en/count).
Both versions provide identical outputs but let's standardize on the docs.

Fixes #6494
2022-11-14 18:10:32 -08:00
Aaron Gable 4f473edfa8
Deprecate 10 feature flags (#6502)
Deprecate these feature flags, which are consistently set in both prod
and staging and which we do not expect to change the value of ever
again:
- AllowReRevocation
- AllowV1Registration
- CheckFailedAuthorizationsFirst
- FasterNewOrdersRateLimit
- GetAuthzReadOnly
- GetAuthzUseIndex
- MozRevocationReasons
- RejectDuplicateCSRExtensions
- RestrictRSAKeySizes
- SHA1CSRs

Move each feature flag to the "deprecated" section of features.go.
Remove all references to these feature flags from Boulder application
code, and make the code they were guarding the only path. Deduplicate
tests which were testing both the feature-enabled and feature-disabled
code paths. Remove the flags from all config-next JSON configs (but
leave them in config ones until they're fully deleted, not just
deprecated). Finally, replace a few testdata CSRs used in CA tests,
because they had SHA1WithRSAEncryption signatures that are now rejected.

Fixes #5171 
Fixes #6476
Part of #5997
2022-11-14 09:24:50 -08:00
Aaron Gable 9e67423110
Create new StorageAuthorityReadOnly gRPC service (#6483)
Create a new gRPC service named StorageAuthorityReadOnly which only
exposes a read-only subset of the existing StorageAuthority service's
methods.

Implement this by splitting the existing SA in half, and having the
read-write half embed and wrap an instance of the read-only half.
Unfortunately, many of our tests use exported read-write methods as part
of their test setup, so the tests are all being performed against the
read-write struct, but they are exercising the same code as the
read-only implementation exposes.
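
The embed-and-wrap arrangement can be pictured with the toy sketch below. The type and method names (`storageRO`, `storage`, `orderGetter`) and the in-memory map are hypothetical; the real SA types talk to a database and implement generated gRPC interfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// storageRO is a hypothetical read-only half: it exposes only lookups.
type storageRO struct {
	orders map[int64]string
}

// GetOrder looks up an order by ID.
func (ro *storageRO) GetOrder(id int64) (string, error) {
	o, ok := ro.orders[id]
	if !ok {
		return "", errors.New("order not found")
	}
	return o, nil
}

// storage is the read-write half. Embedding *storageRO promotes GetOrder, so
// both types run the same read code.
type storage struct {
	*storageRO
}

// NewOrder is a write method that exists only on the read-write type.
func (rw *storage) NewOrder(id int64, name string) {
	rw.orders[id] = name
}

// orderGetter is the narrow interface a read-only client would be handed.
type orderGetter interface {
	GetOrder(id int64) (string, error)
}

func main() {
	ro := &storageRO{orders: map[int64]string{}}
	rw := &storage{storageRO: ro}
	rw.NewOrder(1, "example.com")

	var reader orderGetter = ro // cannot call NewOrder through this value
	fmt.Println(reader.GetOrder(1))
}
```

In this arrangement, tests that drive the read-write type still exercise the read-only methods, because the promoted methods are the same code.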

Expose this new service at the SA on the same port as the existing
service, but with (in config-next) different sets of allowed clients. In
the future, read-only clients will be removed from the read-write
service's set of allowed clients.

Part of #6454
2022-11-09 11:09:12 -08:00
Jacob Hoffman-Andrews ee1afbb988
SA: Enable overriding max_statement_time in DSN (#6492)
Also sql_mode and long_query_time.
2022-11-07 15:42:58 -08:00